




Lectures 
on Modern 
Mathematics 


edited by T. L. Saaty, Office of Naval Research 
and Arms Control and Disarmament Agency 

volume II 

John Wiley & Sons, Inc., New York • London • Sydney





Copyright © 1964 by John Wiley & Sons, Inc.


All Rights Reserved. This book or any 
part thereof must not be reproduced in 
any form without the written permission 
of the publisher, except for any purpose 
of the United States Government. 

Library of Congress Catalog Card Number: 63-20639 
Printed in the United States of America 







Preface 


“There is no inquiry which is not finally reducible to a question of
number,” wrote Auguste Comte. However, the reader who has
examined the first volume of this series and who leafs through the
pages of this volume will find that, with some exceptions, the
qualitative inquiry of modern mathematics itself is not always strictly
reducible to a quantitative question of number. A greater effort
is directed in modern mathematics toward structuring and
characterizing systems than toward examining their quantitative
implications. Either we must interpret Comte's “question of number” as
the logical ordering and description of the inquiry in a rigorous
manner rather than its measurement, or we must conclude that perhaps in
his day he could not comprehend what mathematics was to be about.
Are the other sciences more susceptible to quantitative inquiry?

There is no more eloquent and symbolic monument than one 
constructed in the spirit of an age. We may hope that the lectures 
manifest this spirit and provide building stones for the monument 
of modern mathematics. 

The many unsolved problems and new challenges should make us 
more imaginative than Comte in delineating its nature for the 
future. 

In the preface to Volume I, I made the following observations, 
which have overall significance for the series, and should perhaps be 
repeated. 




In our time the growth of mathematical literature, like that of all
the sciences, has burgeoned beyond the point where an individual
finds its content accessible. It has been shown that the number of
scientists and the number of published scientific papers have
multiplied by ten for every doubling of the general population, and
hence that the next decade or two will see the doubling of the
volume of all the existing literature produced until now. The man
who is not a specialist in a given area of research will find that the
sheer volume of published work makes him a stranger, and the
proverbial esotericism of mathematics has by now limited the
possibility of universalism to the very few.

For the future we are promised the applied marvels of data
processing and information storage and retrieval. Nonetheless, it
requires more than mechanical assimilation to unify and sift for
significance the published knowledge of even a relatively small
subfield of mathematics. We are still obliged (and shall be for a long
visible future) to look to the mature specialist himself for such
evaluation, summarization, and interpretation as a partial answer
to the “paper explosion.”

This is the second volume of the series of lectures jointly sponsored
by the George Washington University and the Office of Naval
Research, which by intelligent summaries further extends the
horizons of accessible modern mathematics.

The lectures, the first six of which appeared in the first volume, 
were begun in the fall of 1962 at the Lisner Auditorium of the 
George Washington University and continued at monthly intervals. 
This volume contains six more lectures, three given in the spring 
and three in the fall of 1963. A third volume for the remaining six 
lectures of the academic year 1963-1964 will follow. 

Our intention, as stated in the preface to Volume I, 

. . . was to invite each of the eminent men represented here to
delineate a substantial research area, to describe it broadly and
comprehensively for an audience of mathematicians who are not
specialists in that area, and to contribute to this description his
individual evaluation of the esthetic and practical aspects of the
field, its position in mathematical development as a whole, and its
future, as that might be implied in the conjectural exposition of its
unsolved problems. The speakers have responded to a difficult
challenge, compressing the enormous material at their command
into a necessarily limited space, preserving the original spirit of
the project in their informal approach, refraining from any deep
and intricate excursions which might intimidate the tyro, and at
the same time giving in each case the personal flavor of their own
involvement in mathematical research.

The outstanding success of the lecture series makes us confident 
that this book will be of interest to all mathematicians desiring to 
keep abreast of the major achievements in various fields, as well as 
to those in the general scientific public wanting to have a flavor of 
the rapid and sophisticated development of the “Queen of the 
Sciences.” To the graduate student in mathematics embarking 
on his research career it should be both useful and encouraging, 
providing glimpses of large areas to which he may not have been 
formally introduced and challenging him with unsolved problems. 

Our thanks go again to Professor David Nelson for his continued 
and untiring efforts as the executive of this project. 


T. L. Saaty 
Editor 

Washington, D.C.

June, 1964





List of Lectures 


VOLUME I (Published 1963) 

P. R. Halmos, A Glimpse into Hilbert Space 

Laurent Schwartz, Some Applications of the Theory of Distributions 

A. S. Householder, Numerical Analysis 

Samuel Eilenberg, Algebraic Topology 

Irving Kaplansky, Lie Algebras 

Richard Brauer, Representations of Finite Groups 

VOLUME II 

L. Nirenberg, Partial Differential Equations with Applications in Geometry

Marshall Hall, Jr., Generators and Relations in Groups—The Burnside Problem

R. H. Bing, Some Aspects of the Topology of 3-Manifolds Related to the
Poincaré Conjecture

Lars Gårding, Partial Differential Equations: Problems and Uniformization
in Cauchy's Problem

Lars V. Ahlfors, Quasiconformal Mappings and Their Applications

John Milnor, Differential Topology

VOLUME III (To be published)

Georg Kreisel, Mathematical Logic

M. M. Loève, Stochastic Processes

Paul Erdős, Number Theory

Einar Hille, Classical Analysis

H. S. M. Coxeter, Geometry

Joseph Kampé de Fériet, Random Solutions of Differential Equations



Contents 


1 Partial Differential Equations with Applications in Geometry

L. Nirenberg

2 Generators and Relations in Groups—The Burnside Problem

Marshall Hall, Jr.

3 Some Aspects of the Topology of 3-Manifolds Related to the
Poincaré Conjecture

R. H. Bing

4 Partial Differential Equations: Problems and Uniformization in
Cauchy's Problem

Lars Gårding

5 Quasiconformal Mappings and Their Applications

L. Ahlfors

6 Differential Topology

J. Milnor







1 

Partial Differential 
Equations with 
Applications in 
Geometry* 

L. Nirenberg 


Lecture I General Linear Partial Differential Equations 

We shall present some of the recent developments in the general
theory of partial differential equations. A survey of the extensive
work of the last ten years on equations with constant coefficients
has been given by Professor L. Schwartz in his lectures, and
Professor L. Gårding has described current research on the initial value
problem for hyperbolic equations. Thus we shall not treat these
topics. Furthermore we shall confine ourselves to linear equations
since the author has recently published an expository lecture on
nonlinear equations [45]. Because of lack of space we can consider
only a few topics.

In this lecture we consider mainly equations of arbitrary type, and
treat questions of existence of solutions, locally and in the large,
uniqueness for the initial value problem, smoothness of the
solutions, and describe some of the techniques involved. In the second
lecture we consider elliptic differential equations as they enter in
several problems recently studied in differential geometry in
the large. A number of the topics in this lecture are treated in
Hörmander's book [25], which also presents many results on
equations with constant coefficients. The following books or lecture
notes also contain much of the recent work on such equations:
Treves [53], Gelfand and Šilov [19], Friedman [15].

* This chapter represents results obtained at the Courant Institute of
Mathematical Sciences, New York University, under the sponsorship of the
Office of Naval Research, Contract No. Nonr-285(46). Reproduction in whole
or in part is permitted for any purpose of the United States Government.

We consider functions of $n$ variables $x = (x^1, \ldots, x^n)$ and use
the standard notation for differentiation: $D_j = (1/i)\,\partial/\partial x^j$,
$D = (D_1, \ldots, D_n)$; for a multi-index $\alpha = (\alpha_1, \ldots, \alpha_n)$,
$D^\alpha = D_1^{\alpha_1} \cdots D_n^{\alpha_n}$ is an operator of order
$|\alpha| = \sum \alpha_j$. We also set $\alpha! = \alpha_1!\,\alpha_2! \cdots \alpha_n!$ and
$\xi^\alpha = \xi_1^{\alpha_1} \cdots \xi_n^{\alpha_n}$ for a vector
$\xi = (\xi_1, \ldots, \xi_n)$. A linear differential operator of order $m$ is a
polynomial in $D$ with coefficients depending on $x$, $P(x, D)$. If $P'$ is the
part consisting of terms of exactly order $m$, the principal part, then
$P'(x, \xi)$ is called the characteristic polynomial.

We shall use $|u|_k$ to denote the $C^k$ norm: supremum of derivatives
of $u$ up to order $k$ in the domain. We shall also use the spaces $H_s$
with norms

$$\|u\|_s^2 = \int \sum_{|\alpha| \le s} |D^\alpha u|^2 \, dx$$

If the domain is the whole space this norm may also be expressed in
terms of the Fourier transform $\hat u(\xi) = \int \exp(-i x \cdot \xi)\, u(x)\, dx$ by

$$\|u\|_s^2 = \int (1 + |\xi|^2)^s \, |\hat u(\xi)|^2 \, d\xi$$

This form enables us to define the norm for all real $s$.

A function defined in $\mathcal{D}$ will be said to belong to $C_0^\infty(\mathcal{D})$ if it is
$C^\infty$ and its support is a compact subset of $\mathcal{D}$.

1. AN EXAMPLE WITH CONSTANT COEFFICIENTS 

At the heart of the study of differential equations are inequalities
related to the differential operators. We shall begin with a simple
but instructive illustration of their use. Consider the well-known
inequality of Hörmander [20] for operators $P(D)$ with constant
coefficients. We denote by $P^{(\alpha)}(\xi)$ the derivative
$(\partial/\partial\xi_1)^{\alpha_1} \cdots (\partial/\partial\xi_n)^{\alpha_n} P(\xi)$
of the polynomial $P(\xi)$. The inequality asserts that
there is a constant $C$ depending only on $n$ and $m$ (the order of $P$)
such that for all $C^\infty$ functions $v$ with support in a sphere of radius $r$
the inequality

$$\sum_{|\alpha| \ge 1} \|P^{(\alpha)} v\|^2 \le C r^2 \|Pv\|^2 (1 + r^{2m-2})$$

holds, where $\|\ \|$ denotes the $L^2$ norm. Simple proofs may be
found in [25] or [53]. Thus in particular, with a constant $c$,

$$\|v\|^2 \le c r^2 \|Pv\|^2 \tag{1.1}$$

With the aid of the Hahn-Banach theorem, this inequality easily
implies the existence of $L^2$ solutions as distributions of $Pv = f$ for
every $f$ in $C_0^\infty$, that is, in $C^\infty$ with compact support. By a slight
modification it may also be used to give information in other
problems, such as uniqueness and regularity in the Cauchy problem.
Namely, if we set $v = e^{\eta \cdot x} u$ with $\eta = (\eta_1, \ldots, \eta_n)$ a real vector,
$\eta \cdot x = \sum \eta_j x^j$, so that $e^{\eta \cdot x} Pu = P(D + i\eta)v$, we find on applying
Hörmander's inequality to the operator $P(D + i\eta)$,

$$\sum_{|\alpha| \ge 1} \|e^{\eta \cdot x} P^{(\alpha)} u\|^2 \le c r^2 \|e^{\eta \cdot x} P u\|^2 \tag{1.2}$$

l«l>i 

This yields immediately the fact that if $u$ is a solution of $Pu = f$
with $u$ and $f$ having compact support, the convex hulls of the
supports of $u$ and $f$ are the same. To prove this we only have to show
that if $u$ has its support in some half space $\xi \cdot x \ge 0$ and $f$ has its
support in a smaller half space $\xi \cdot x \ge a > 0$, then indeed $u$ also has
its support there. But this follows at once from (1.2) by setting
$\eta = -\tau\xi$, $\tau > 0$, and letting $\tau \to \infty$, since the right-hand side in (1.2)
is $O(e^{-2\tau a})$.
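The limiting argument can be written out in a few lines. What follows is only a sketch, under the assumptions that $u$ has compact support inside a sphere of radius $r$, that the derivative $P^{(\alpha)}$ with $|\alpha| = m$ is a nonzero constant, and that $\delta > 0$ is an auxiliary number introduced here for illustration.

```latex
% Put \eta = -\tau\xi in (1.2). On supp f we have \xi\cdot x \ge a, hence
% e^{-2\tau\,\xi\cdot x} \le e^{-2\tau a} there:
\sum_{|\alpha|\ge 1} \bigl\| e^{-\tau\,\xi\cdot x} P^{(\alpha)} u \bigr\|^2
   \;\le\; c\,r^2 \bigl\| e^{-\tau\,\xi\cdot x} f \bigr\|^2
   \;\le\; c\,r^2\, e^{-2\tau a}\, \|f\|^2 .

% Keep one term with |\alpha| = m, where P^{(\alpha)} u = P^{(\alpha)}\cdot u
% with P^{(\alpha)} a nonzero constant, and integrate only over the region
% \xi\cdot x \le a-\delta, on which the weight is at least e^{-2\tau(a-\delta)}:
|P^{(\alpha)}|^2\, e^{-2\tau(a-\delta)} \int_{\xi\cdot x \le a-\delta} |u|^2\,dx
   \;\le\; c\,r^2\, e^{-2\tau a}\, \|f\|^2 .

% Dividing by the weight:
\int_{\xi\cdot x \le a-\delta} |u|^2\,dx
   \;\le\; \frac{c\,r^2}{|P^{(\alpha)}|^2}\; e^{-2\tau\delta}\, \|f\|^2
   \;\longrightarrow\; 0 \qquad (\tau \to \infty).

% Since \delta > 0 is arbitrary, u = 0 for \xi\cdot x < a.
```

Applying this to every half space containing the support of $f$ shows that the support of $u$ lies in the convex hull of the support of $f$; the reverse inclusion is immediate from $f = Pu$.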

This simple device works also in the purely local problem of
uniqueness in the Cauchy problem for solutions of

$$|Pv| \le k \sum_{|\alpha| \ge 1} |P^{(\alpha)} v| \tag{1.3}$$

if $v$ has zero Cauchy data on a hypersurface which is strictly convex.
If $v$ is a solution on the convex side $\mathcal{D}$, it vanishes identically near the
surface. (Observe that the hypersurface may contain characteristic
points.) To prove this, suppose, for simplicity, that the
surface is that of a sphere, and that the origin lies on the surface.
We wish to prove that $v = 0$ in some neighborhood of the origin
(having extended $v$ as zero on the nonconvex side of the surface).
We may suppose that the inner-directed unit normal to the surface
at the origin is $(1, 0, \ldots, 0)$. Let $\zeta(x^1) \ge 0$ be a $C^\infty$ function for
$x^1 \ge 0$ which equals 1 for $x^1 < \epsilon$ and vanishes for $x^1 > 2\epsilon$, $\epsilon$ being
small. For $\epsilon$ small, because of the convexity, the set of points in $\mathcal{D}$
with $x^1 < 2\epsilon$ may be contained in a sphere of arbitrarily small
radius $r$. If we now apply (1.2) to the function $u = \zeta v$ with
$\eta = -\tau(1, 0, \ldots, 0)$ we find, with $c_1(\epsilon)$ independent of $\tau > 0$,

$$\sum_{|\alpha| \ge 1} \|e^{-\tau x^1} \zeta P^{(\alpha)} v\|^2 \le c r^2 \|e^{-\tau x^1} \zeta P v\|^2 + c_1(\epsilon)\, e^{-2\tau\epsilon}$$

The last term arises from the terms involving derivatives of $\zeta$, but
these are non-zero only for $x^1 > \epsilon$. If we now insert (1.3) in the
right-hand side we find, with some constant $c_2$ independent of $\tau$,

$$\sum_{|\alpha| \ge 1} \|e^{-\tau x^1} \zeta P^{(\alpha)} v\|^2 \le c_2 r^2 \sum_{|\alpha| \ge 1} \|e^{-\tau x^1} \zeta P^{(\alpha)} v\|^2 + c_1(\epsilon)\, e^{-2\tau\epsilon}$$

Thus if $\epsilon$ is sufficiently small so that $c_2 r^2 < \tfrac12$ we have

$$\sum_{|\alpha| \ge 1} \|e^{-\tau x^1} \zeta P^{(\alpha)} v\|^2 \le 2 c_1(\epsilon)\, e^{-2\tau\epsilon}$$

In particular then, restricting integration on the left to $x^1 < \epsilon/2$ and
letting $\tau \to \infty$, we see that this is possible only if $v = 0$ for $x^1 < \epsilon/2$.
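The last step may be made explicit as follows. This is a sketch that assumes the final inequality of the argument has right-hand side $2c_1(\epsilon)e^{-2\tau\epsilon}$ (squared norms, weight $e^{-2\tau x^1}$) and that $P^{(\alpha)}$ with $|\alpha| = m$ is a nonzero constant.

```latex
% Assumed final inequality of the argument:
\sum_{|\alpha|\ge 1} \bigl\| e^{-\tau x^1}\,\zeta\,P^{(\alpha)} v \bigr\|^2
   \;\le\; 2\,c_1(\epsilon)\,e^{-2\tau\epsilon}.

% On the region x^1 < \epsilon/2 we have \zeta = 1 and
% e^{-2\tau x^1} \ge e^{-\tau\epsilon}. Keeping a single term with
% |\alpha| = m (so that P^{(\alpha)} v is a nonzero constant times v):
|P^{(\alpha)}|^2\, e^{-\tau\epsilon} \int_{x^1<\epsilon/2} |v|^2\,dx
   \;\le\; 2\,c_1(\epsilon)\,e^{-2\tau\epsilon},

% that is,
\int_{x^1<\epsilon/2} |v|^2\,dx
   \;\le\; \frac{2\,c_1(\epsilon)}{|P^{(\alpha)}|^2}\; e^{-\tau\epsilon}
   \;\longrightarrow\; 0 \qquad (\tau\to\infty).
```

The gain comes entirely from the gap between the region where the error term lives ($x^1 > \epsilon$) and the region where the conclusion is drawn ($x^1 < \epsilon/2$).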

The idea of using inequalities with weight functions, as (1.2), in 
proving uniqueness for the Cauchy problem goes back to Carleman. 

John [26] has observed that (1.2) may be used also to prove
regularity of a solution $v$ of

$$Pv = f, \qquad f \in C^\infty$$

in such a domain bounded by a strictly convex surface which is
noncharacteristic and on which $v$ has vanishing Cauchy data; Malgrange
[38] also proved this. Hörmander [25] has given the most elegant
proof of this using (1.2), which proceeds as follows. Using the
preceding description, with $(1, 0, \ldots, 0)$ normal to the surface at the
origin, we will prove that $v \in C^\infty$ in some fixed neighborhood
$x^1 < \epsilon$ of the origin assuming, for simplicity, that $v \in C^m$. Set
$u = \zeta v$ as before, $Pu = g$, and observe that $g \in C^\infty$ for $x^1 < \epsilon$ and
$g \in C^1$ for $x^1 > \epsilon$.



Denote by $\hat u(x^1, \sigma)$, $\sigma = (\sigma_2, \ldots, \sigma_n)$, the Fourier transform of $u$
in the variables $(x^2, \ldots, x^n)$,

$$\hat u(x^1, \sigma) = \int u(x) \exp\Bigl(-i \sum_{j=2}^{n} x^j \sigma_j\Bigr)\, dx^2 \cdots dx^n$$

then

$$\hat g = \widehat{P(D_1, \ldots, D_n)u} = P(D_1, \sigma_2, \ldots, \sigma_n)\,\hat u$$

Since the boundary is noncharacteristic at the origin,
$(\partial/\partial\xi_1)^m P(\xi) \ne 0$; set $(\partial/\partial\xi_1)^j P(\xi) = P^j$. We now apply (1.2)
for a fixed $\sigma$ to $\hat u$ as a function of the single variable $x^1$. Then

$$\int e^{-2\tau x^1} \Bigl(|\hat u|^2 + \sum_j |P^j \hat u|^2\Bigr)\, dx^1 \le c' r^2 \int |e^{-\tau x^1} \hat g|^2 \, dx^1$$

for arbitrary real $\tau$. We may choose $\tau$ as an arbitrary function of
$\sigma$; with $\sigma^2 = \sum \sigma_j^2$ set $e^{2\tau} = (1 + \sigma^2)^p$, for some $p > 0$. On
multiplying both sides of the resulting inequality by $(1 + \sigma^2)^{\epsilon p}$ we find

$$\int (1 + \sigma^2)^{p(\epsilon - x^1)} \Bigl(|\hat u|^2 + \sum_j |P^j \hat u|^2\Bigr)\, dx^1 \le c' r^2 \int (1 + \sigma^2)^{p(\epsilon - x^1)} |\hat g|^2 \, dx^1 \tag{1.4}$$

Now observe that since $g$ belongs to $C^\infty$ for $x^1 < \epsilon$ and to $C^1$ for
$x^1 > \epsilon$, the integral of the right-hand side with respect to $\sigma$ is
convergent (by Parseval) for any $p > 0$. Hence this is true also for
the term on the left, that is,

$$\iint (1 + \sigma^2)^{p(\epsilon - x^1)} \Bigl(|\hat u|^2 + \sum_j |P^j \hat u|^2\Bigr)\, dx^1\, d\sigma < \infty$$

Thus in particular, restricting integration to $x^1 < \epsilon/2$, where $u = v$,
we have

$$\iint_{x^1 < \epsilon/2} (1 + \sigma^2)^{p\epsilon/2} \Bigl(|\hat v|^2 + \sum_j |P^j \hat v|^2\Bigr)\, dx^1\, d\sigma < \infty$$

But since $p > 0$ is arbitrary, this means (by Parseval) that in
$x^1 < \epsilon/2$ the function $v$ and its $x^1$ derivatives up to order $m - 1$
have square integrable derivatives in the variables $(x^2, \ldots, x^n)$
of all orders. The differential equation $Pv = f$ enables us to express
$(\partial/\partial x^1)^m v$ in terms of these other derivatives and of $f$, and hence it too
has square integrable derivatives in the variables $(x^2, \ldots, x^n)$ of
all orders. By repeated differentiations of the equation we may
express further $x^1$ derivatives in terms of derivatives which we know
are square integrable, and thus we find that $v$ has square integrable
derivatives in all variables of all orders for $x^1 < \epsilon$. But then
$v \in C^\infty$.
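The passage from bounds on the quantities $P^j \hat v$ to bounds on the $x^1$ derivatives of $v$ up to order $m - 1$ can be spelled out. The sketch below assumes the notation $P^j = (\partial/\partial\xi_1)^j P$ and writes $P$ as a polynomial in $\xi_1$ with coefficients $a_k(\sigma)$; the names $a_k$ and $a$ are introduced here for illustration only.

```latex
% Expand P in powers of \xi_1, with coefficients polynomial in \sigma:
P(\xi_1,\sigma) = \sum_{k=0}^{m} a_k(\sigma)\,\xi_1^{\,k},
\qquad a_m = a \neq 0 \ \text{constant (noncharacteristic boundary)}.

% Differentiating j times with respect to \xi_1:
P^{j}(\xi_1,\sigma) = \sum_{k=j}^{m} \frac{k!}{(k-j)!}\; a_k(\sigma)\,\xi_1^{\,k-j}.

% For j = m-1 this is of first degree in \xi_1, so
P^{m-1}(D_1,\sigma)\,\hat v = m!\,a\,D_1\hat v + (m-1)!\,a_{m-1}(\sigma)\,\hat v ,
\qquad
D_1\hat v = \frac{1}{m!\,a}\Bigl( P^{m-1}\hat v - (m-1)!\,a_{m-1}(\sigma)\,\hat v \Bigr).

% In general, for 1 \le k \le m-1,
P^{m-k}(D_1,\sigma)\,\hat v
   = \frac{m!}{k!}\,a\,D_1^{\,k}\hat v
     + \sum_{l<k} \bigl(\text{polynomial in } \sigma\bigr)\, D_1^{\,l}\hat v ,
% so D_1^k \hat v is recovered inductively from P^{m-k}\hat v and the
% lower-order terms D_1^l \hat v, l < k.
```

Because the weighted integrals are finite for every power $p$, the polynomial factors in $\sigma$ cost nothing, and each $D_1^k \hat v$ with $k \le m - 1$ inherits the same weighted square integrability.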

Observe that the procedure is to prove first the regularity of the 
solution in directions tangential to the boundary and then to infer 
regularity in all variables. This approach is used repeatedly in 
proving regularity for various boundary value problems. 

We should remark that Treves [52] (see also [53]) has given an
extension of Hörmander's inequality which is also more useful: for
functions $u$ of compact support,

$$t^{2\alpha} \Bigl\| \exp\Bigl\{\sum_j (t_j x^j)^2\Bigr\} P^{(\alpha)} u \Bigr\|^2 \le \text{constant}\, \Bigl\| \exp\Bigl\{\sum_j (t_j x^j)^2\Bigr\} P u \Bigr\|^2 \tag{1.5}$$

where the constant depends only on $m$ and is independent of the
support of $u$.

2. EQUATIONS WITH VARIABLE COEFFICIENTS 

We begin with the question of existence of solutions of equations 

(2.1) P(x, D)u = f 

with (complex) variable coefficients. It is of course desirable to
make the solution unique by finding appropriate well-posed
boundary or initial conditions. It is also of interest, however, to know
whether any solutions exist at all. As we have remarked in the
previous section, this is always so in the constant coefficient case
with $f \in C_0^\infty$. Indeed for constant coefficients the equation has
been studied in great detail for many classes of functions $f$.

As we have remarked, the existence of solutions is intimately tied
up with estimates. For instance suppose that $P^*$ is the formal
adjoint of the operator $P$, defined in terms of the scalar product
$(\ ,\ )$ by

$$(Pu, v) = (u, P^*v)$$

where one of the functions $u$, $v$ has compact support. Suppose we can
establish an a priori estimate of, say, the form

$$\|u\|_s \le C \|P^*u\|_k \tag{2.2}$$

for all functions $u \in C_0^\infty(\mathcal{D})$ (or satisfying some less restrictive
boundary conditions), for some $s$ and $k$ and constant $C$. Then this
implies, by a functional analytic argument using the Hahn-Banach
theorem, that $Pu = f$ always has a distribution solution $u$ in $\mathcal{D}$
(satisfying in some sense adjoint boundary conditions) for every
$f \in C_0^\infty(\mathcal{D})$, and actually for a wider class of functions $f$.
Conversely one sees [25] with the aid of the closed graph theorem that
existence of distribution solutions of (2.1) in $\mathcal{D}$ for every
$f \in C_0^\infty(\mathcal{D})$ implies the following inequality: if $\Omega$ is an open
subset of $\mathcal{D}$ with compact closure in $\mathcal{D}$, there are constants $C$, $k$, $N$
such that

$$|(f, v)| \le C\, |f|_k\, |P^*v|_N \quad \text{for all } f, v \in C_0^\infty(\Omega) \tag{2.3}$$

There are several techniques for obtaining estimates of the form
(2.2). (i) Energy integral methods, in which we consider the $L^2$
scalar product $(Pu, Mu)$, where $M$ is a differential operator
chosen in such a way that after suitable use of Green's theorem
(making use of the boundary conditions) we obtain the integral of a
positive quadratic form $Q(u, u)$ of $u$ and some derivatives. Thus

$$|(Pu, Mu)| \ge Q(u, u) + B(u, u)$$

where $B(u, u)$ is a similar boundary integral which is definite in
virtue of the boundary conditions. (ii) Fourier transform and
Parseval's theorem; this may also be used in proving positive
definiteness of integrals of quadratic forms arising in (i). (iii)
Integral representation of solutions via integral operators, often
obtained from the fundamental solution or an approximate
fundamental solution.

It was believed that every differential equation (2.1) with $f \in C^\infty$
possessed solutions, at least locally, but in 1957 Lewy [35] presented
a first-order operator $P$ (with analytic coefficients) such that
$Pu = f$ has no distribution solution in any open set for most $C^\infty$
functions $f$. Hörmander [23] (see also [25]) then derived necessary
conditions for solvability of an operator $P$. We say that $P$ is
solvable at a point $x_0$ if there is an open neighborhood $\Omega$ of $x_0$ such
that for every $f \in C_0^\infty(\Omega)$ the equation $Pu = f$ has a distribution
solution in $\Omega$. His condition is expressed in terms of the
commutator $[P', \bar P']$ of the principal part $P'$ of $P$ and its complex
conjugate $\bar P'$ obtained by conjugating all coefficients; this
commutator is a differential operator of order $\le 2m - 1$. Denote by
$C(x, \xi)$ the characteristic polynomial of its leading part, that is, of
terms of the order $2m - 1$. Then Hörmander's necessary condition
for solvability at $x_0$ is:

(H) For $x$ in a neighborhood of $x_0$, every real root $\xi$ of $P'(x, \xi)$ is
also a root of $C(x, \xi)$.
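As an illustration, condition (H) can be checked by hand for Lewy's operator. The text does not write the operator out; the sketch below assumes one common normalization of it in three variables $(x, y, t)$, and the symbols $L$, $z$, and $\xi_3$ are notation chosen here.

```latex
% Lewy's operator in a common normalization (an assumption of this sketch):
L = \partial_x + i\,\partial_y - 2i\,(x+iy)\,\partial_t ,
\qquad
\bar L = \partial_x - i\,\partial_y + 2i\,(x-iy)\,\partial_t .

% Commutator: only the derivatives falling on the coefficients survive.
[L,\bar L]
  = \bigl[(\partial_x + i\partial_y)\bigl(2i(x-iy)\bigr)\bigr]\partial_t
    - \bigl[(\partial_x - i\partial_y)\bigl(-2i(x+iy)\bigr)\bigr]\partial_t
  = 4i\,\partial_t + 4i\,\partial_t
  = 8i\,\partial_t .

% Symbols, with \partial_j = i D_j, i.e. \partial_j \mapsto i\xi_j:
P'(x,\xi) = i\xi_1 - \xi_2 + 2(x+iy)\,\xi_3
          = (-\xi_2 + 2x\,\xi_3) + i\,(\xi_1 + 2y\,\xi_3),
\qquad
C(x,\xi) = -8\,\xi_3 .

% Real roots of P': \xi_2 = 2x\,\xi_3, \ \xi_1 = -2y\,\xi_3, with \xi_3 free.
% Every nonzero real root has \xi_3 \neq 0, and there C(x,\xi) = -8\xi_3 \neq 0.
```

So at every point there is a real root of $P'$ at which $C$ does not vanish, and (H) fails everywhere, consistent with the nonexistence statement above.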



In proving the necessity of (H), Hörmander first showed that (2.3)
is a necessary condition and then, assuming that (H) does not hold
at $x_0$, he constructed a family of functions (in any given
neighborhood $\Omega$ of $x_0$) depending on a parameter $\tau$,
$f_\tau, v_\tau \in C_0^\infty(\Omega)$, such that as $\tau \to \infty$ these functions
violated (2.3). The construction of $v_\tau$, an approximate solution of
$P^*v = 0$, is a bit involved.

For first-order operators $P$, Nirenberg and Treves [46] have found
both further necessary conditions and also sufficient conditions
which agree if the leading coefficients of $P$ are analytic. These
conditions may be expressed in terms of the successive commutators
$C_k = [P', C_{k-1}]$, $k = 2, 3, \ldots$, $C_1 = [P', \bar P']$, which are all
first-order operators. At any point $x$ let $k_0(x)$ be the first value of $k$, if
finite, such that $C_k(x, \xi)$ does not vanish at every real root $\xi$ of
$P'(x, \xi)$. Then the necessary and sufficient condition for $P$ to be
solvable at $x_0$ is:

(P') At every point $x$ in some neighborhood of $x_0$, $k_0(x)$, if finite,
is even.
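To see how $k_0$ is computed, consider a model operator in two variables. This example is not taken from the text; the operator $P = D_1 + i\,(x^1)^h D_2$ with an integer $h \ge 1$, and the constants $c_k$, are hypothetical choices made for illustration.

```latex
% Model operator (illustrative assumption), with D_j = (1/i)\,\partial/\partial x^j:
P = P' = D_1 + i\,(x^1)^h D_2 , \qquad \bar P' = D_1 - i\,(x^1)^h D_2 .

% First commutator, using D_1\bigl((x^1)^h\bigr) = -i\,h\,(x^1)^{h-1}:
C_1 = [P',\bar P'] = -2i\,\bigl(D_1 (x^1)^h\bigr) D_2 = -2h\,(x^1)^{h-1} D_2 .

% Each further commutator with P' differentiates the coefficient once more:
C_k = [P', C_{k-1}] = c_k\,(x^1)^{h-k} D_2 ,
\qquad c_k = -2\,(-i)^{k-1}\,\frac{h!}{(h-k)!} \neq 0 \quad (1 \le k \le h).

% Real roots of P'(x,\xi) = \xi_1 + i\,(x^1)^h \xi_2 :
%   x^1 \neq 0 : only \xi = 0, so there is nothing to check;
%   x^1 = 0    : \xi = (0,\xi_2), \ \xi_2 \neq 0 .
% At x^1 = 0 the symbol C_k(x,\xi) = c_k\,(x^1)^{h-k}\xi_2 vanishes for k < h
% and equals c_h\,\xi_2 \neq 0 for k = h, hence
k_0 = h \quad \text{on } \{x^1 = 0\}.
```

For this model, condition (P') therefore holds near a point of the line $x^1 = 0$ exactly when $h$ is even.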

This condition may also be expressed in terms of the
bicharacteristics of the real and imaginary parts of $P'(x, \xi)$. Let $A(x, \xi)$ and
$B(x, \xi)$ be the real and imaginary parts of $P'(x, \xi)$.

Definition. A bicharacteristic of $A(x, \xi)$ is a curve in the real
$(x, \xi)$ space satisfying the differential equation system (where a dot
denotes differentiation along the bicharacteristic)

$$\dot x^j = A_{\xi_j}(x, \xi), \qquad \dot \xi_j = -A_{x^j}(x, \xi) \tag{2.4}$$

We see immediately that $A$ is constant along any bicharacteristic.
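The constancy of $A$ is a one-line chain-rule computation. The sketch assumes that (2.4) is the usual Hamiltonian system $\dot x^j = A_{\xi_j}$, $\dot\xi_j = -A_{x^j}$, with summation over the repeated index $j$.

```latex
% Differentiate A along a solution (x(s), \xi(s)) of (2.4):
\frac{d}{ds}\,A\bigl(x(s),\xi(s)\bigr)
   = A_{x^j}\,\dot x^j + A_{\xi_j}\,\dot\xi_j
   = A_{x^j}\,A_{\xi_j} - A_{\xi_j}\,A_{x^j}
   = 0 .
```

In particular a bicharacteristic that meets the set $A = 0$ stays in it, which is why one may speak of bicharacteristics "on which $A$ vanishes."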

Condition (P') (for analytic coefficients) is equivalent to the
condition: $B$ does not change sign on any bicharacteristic of $A$ on
which $A$ vanishes, and also $A$ does not change sign on any
bicharacteristic of $B$ on which $B$ vanishes.
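The sign condition is easy to test on a model symbol. The example below is an illustration that does not come from the text: it assumes the hypothetical first-order symbol $P'(x, \xi) = \xi_1 + i\,(x^1)^h \xi_2$ in two variables, with an integer $h \ge 1$, and uses the Hamiltonian form of the bicharacteristic equations.

```latex
% Real and imaginary parts of the assumed model symbol:
A(x,\xi) = \xi_1 , \qquad B(x,\xi) = (x^1)^h\,\xi_2 .

% Bicharacteristics of A: \dot x^1 = 1,\ \dot x^2 = 0,\ \dot\xi_1 = \dot\xi_2 = 0, so
x^1(s) = x^1_0 + s, \qquad \xi_1,\ \xi_2 \ \text{constant}.

% On those with A = 0 (that is, \xi_1 = 0) and \xi_2 \neq 0:
B\bigl(x(s),\xi(s)\bigr) = (x^1_0 + s)^h\,\xi_2 ,
% which changes sign at s = -x^1_0 when h is odd and never when h is even.

% Bicharacteristics of B on which B = 0, for h \ge 2: either \xi_2 = 0 or x^1 = 0,
% and in both cases \dot\xi_1 = -h\,(x^1)^{h-1}\xi_2 = 0, so A = \xi_1 is constant
% and does not change sign.
```

For even $h$ both halves of the condition hold, while for odd $h$ the first half already fails on the curves through $x^1 = 0$.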

The proof of necessity of (P') is simply an extension of
Hörmander's necessity proof, whereas sufficiency is proved by establishing
an estimate (2.2) with $s = 0$, $k = 1$; the argument is slightly tricky.

We should remark that an operator may be solvable at every
point in a domain without being globally solvable in the domain.
An analogous situation holds even for operators with constant
coefficients. Malgrange [37] has shown that the equation $P(D)u = f$
is always solvable in a domain $\mathcal{D}$ for every $C^\infty$ function $f$ in $\mathcal{D}$ (with
$u$ a distribution in $\mathcal{D}$) if and only if $\mathcal{D}$ is $P$-convex. To be $P$-convex
means that to any compact set $K_1 \subset \mathcal{D}$ there is another compact
set $K_2 \subset \mathcal{D}$ such that any distribution $u$ with compact support in
$\mathcal{D}$ and with the support of $P(-D)u$ in $K_1$ has its support in $K_2$. (Since
the convex hull of supp $P(-D)u$ agrees with that of supp $u$, it follows
that any bounded convex set is $P$-convex for every operator $P$ with
constant coefficients.) Another way of saying that $\mathcal{D}$ is $P$-convex is
to say that uniqueness for the Cauchy problem for $P(-D)$ holds in
some neighborhood of the boundary, that is, if $u$ is defined on an
open neighborhood $\Omega_1$ of the boundary of $\mathcal{D}$, satisfies $P(-D)u = 0$,
and vanishes outside $\mathcal{D}$, then $u$ vanishes on a neighborhood $\Omega_2$ of the
boundary, where $\Omega_2$ depends only on $\Omega_1$ and not on the particular
solution $u$.

Hörmander [24] has completely characterized the $P$-convex
domains for $n = 2$, and has given a sufficient condition for $n > 2$
which we may describe. Assume that $\partial\mathcal{D}$ is a $C^2$ hypersurface.
Suppose that at every point where $\partial\mathcal{D}$ is characteristic with respect
to $P$ the curvature of the hypersurface in the direction of either
Re grad $P'(N)$ or Im grad $P'(N)$ is positive (here $N$ is the unit
normal to $\partial\mathcal{D}$ at the point). Then $\mathcal{D}$ is $P$-convex.

Thus instead of assuming convexity of the boundary, as in Section
1, we assume only convexity of the boundary along a particular
bicharacteristic of either Re $P'$ or Im $P'$.

The idea of the proof is easily sketched. We must prove
uniqueness of the Cauchy problem for $P(-D)u = 0$ in some neighborhood
of the boundary. The argument is local. At a point where the
boundary is not characteristic we obtain a neighborhood of the
point where $u = 0$ by Holmgren's theorem. At a point $x_0$ where $\partial\mathcal{D}$
is characteristic, Hörmander makes clever use of the hypothesis to
construct a family of hypersurfaces (or rather pieces of
hypersurfaces) whose boundaries lie outside $\mathcal{D}$, which together cover
a neighborhood of $x_0$, and such that none of them contains any
characteristic points. He then employs Holmgren again to conclude that
$u = 0$ in a neighborhood of $x_0$. This material is also contained in
[25] and [53].

In Chapter 8 of his book [25] Hörmander has obtained much
more general results than those derived in Section 1 from (1.2),
and has also extended them to operators with variable coefficients.
He treats global existence problems and uniqueness and regularity
in the Cauchy problem. In this work the bicharacteristics play a
very essential role.

The operators he considers are either elliptic or belong to a class
termed principally normal. $P$ is principally normal if there is a
differential operator $Q(x, D)$ homogeneous of order $m - 1$ in $D$ such
that the leading part $C$ of the commutator $[P', \bar P']$ may be expressed
as

$$C(x, \xi) = P'(x, \xi)\,\bar Q(x, \xi) + Q(x, \xi)\,\bar P'(x, \xi) \tag{2.5}$$

In view of the necessary condition (H) this is not an unreasonable
hypothesis.

In addition, Hörmander introduces the important notions of
pseudoconvexity of a hypersurface and strong pseudoconvexity.
To explain these notions suppose first that $P'(x, \xi)$ has real
coefficients and suppose that the $x$ projection of a bicharacteristic curve
given by (2.4) on which $P'$ vanishes is tangent to the surface at a point
$x_0$, that is, if the surface is given by $\psi(x) = \psi(x_0)$ with
grad $\psi(x_0) \ne 0$, we suppose

$$P'(x_0, \xi) = 0, \qquad \psi_{x^j}(x_0)\, P'_{\xi_j}(x_0, \xi) = 0 \tag{2.6}$$

If a dot denotes differentiation with respect to the parameter on the
bicharacteristic, we see that on the curve $\dot\psi(x_0) = 0$ whereas (using
summation convention)

$$\ddot\psi(x_0) = \psi_{x^j x^k}\, \dot x^j \dot x^k + \psi_{x^j}\, \ddot x^j
= \psi_{x^j x^k}\, P'_{\xi_j} P'_{\xi_k} + \psi_{x^j} \bigl(P'_{\xi_j x^k} P'_{\xi_k} - P'_{\xi_j \xi_k} P'_{x^k}\bigr)$$

in virtue of (2.4). Pseudoconvexity is the requirement that $\psi$ be a
convex function on such a curve at such a point; in fact it requires
that $\ddot\psi(x_0) > 0$. Thus a bicharacteristic tangent to the surface
must lie on the side where $\psi > \psi(x_0)$. More generally for complex
coefficients the definition is as follows:

Pseudoconvexity. An oriented hypersurface $\psi(x) = \psi(x_0)$ is
pseudoconvex with respect to $P$ at a point $x_0$ if at $x_0$

$$\psi_{x^j x^k}\, P'_{\xi_j} \bar P'_{\xi_k} + \psi_{x^j}\, \mathrm{Re}\bigl(P'_{\xi_j x^k} \bar P'_{\xi_k} - P'_{\xi_j \xi_k} \bar P'_{x^k}\bigr) > 0 \tag{2.7}$$

for every real $\xi \ne 0$ satisfying (2.6).
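It may help to specialize the definition to the simplest case. The sketch below assumes that $P'$ has real constant coefficients and that the bicharacteristic equations have the Hamiltonian form $\dot x^j = P'_{\xi_j}$, $\dot\xi_j = -P'_{x^j}$; the vector $w$ is notation introduced here.

```latex
% Constant coefficients: P'_{x^k} = 0, so \xi is constant along a
% bicharacteristic and its x-projection is a straight line:
\dot\xi_j = 0, \qquad \dot x^j = P'_{\xi_j}(\xi) =: w^j, \qquad x(s) = x_0 + s\,w .

% The second derivative of \psi along the line is the Hessian of \psi
% applied to the direction w:
\ddot\psi(x_0) = \psi_{x^j x^k}(x_0)\, w^j w^k .

% Pseudoconvexity at x_0 then reads: for every real \xi \neq 0 with
P'(\xi) = 0 \quad\text{and}\quad \psi_{x^j}(x_0)\, w^j = 0 \qquad (w = \operatorname{grad}_\xi P'(\xi)),
% one has
\psi_{x^j x^k}(x_0)\, w^j w^k > 0 .
```

In this case pseudoconvexity asks only that $\psi$ be strictly convex along those tangent directions $w$ that are bicharacteristic directions, a much weaker requirement than convexity of the level surface.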



He then proves the following global existence theorem (we are
omitting the precise smoothness condition on the functions).

Theorem 1. Suppose that $P$ is a principally normal operator in a
domain $\mathcal{D}$ in which there is a real $C^2$ function $\psi$ with grad $\psi \ne 0$, all
of whose level surfaces are pseudoconvex. Let $\Omega$ be an open set
with compact closure in $\mathcal{D}$ and assume that $f$ is a function in $\mathcal{D}$
which is orthogonal in the $L^2$ sense to all solutions $v$ of the adjoint
problem $P^*v = 0$ having compact support in $\Omega$. Then there is a
solution in $\Omega$ of $Pu = f$. Furthermore if $f \in C^\infty$ there is a $C^\infty$
solution.

In addition he proves the following regularity theorem in the
Cauchy problem.

Theorem 2. Suppose that $P$ is principally normal and that $\psi$ is
defined in a neighborhood of $x_0$, grad $\psi \ne 0$, with its level surface
through $x_0$ pseudoconvex there. If $u$ is a distribution solution of
$Pu = f$ in a neighborhood of $x_0$ with $f \in C^\infty$ and $u \in C^\infty$ on the
side of the surface where $\psi > \psi(x_0)$, then $u \in C^\infty$ in a whole
neighborhood of $x_0$.

The first theorem is proved by deriving estimates similar to the
estimates (1.2), but with weight factor $e^{\tau\varphi}$ where $\varphi$ is a nonlinear
function closely related to $\psi$. The second theorem is proved by a
suitable analogue of (1.5). The details of the proof are too long to
be described here.

We remark that the pseudoconvexity condition in the second
theorem is essential. Following a construction of Zerner [55],
Hörmander [25] constructs a solution of an equation $Pu = 0$
with constant coefficients which is $C^\infty$ everywhere except on a
bicharacteristic.

Strong pseudoconvexity imposes an additional condition at the
nonzero double roots in $\tau$ of $P(x_0, \xi + i\tau\, \mathrm{grad}\, \psi(x_0))$. Assuming
strong pseudoconvexity he proves uniqueness for the Cauchy
problem near $x_0$ for zero data on the surface $\psi = \psi(x_0)$. This
result generalizes the previous results by Calderón [8] and by
Hörmander [22] in which double roots are excluded. Examples of
Cohen [11] show that the condition cannot be omitted. The
method of proof yields not only uniqueness in the Cauchy problem
but also estimates for the solution. In general, of course, a
uniqueness theorem should always provide estimates for the solution. We
remark that John [26] has adapted the Holmgren uniqueness
theorem for analytic equations so as to yield estimates for the
solution.

3. BOUNDARY VALUE PROBLEMS 

Up to now we have considered differential equations without 
imposing boundary conditions. For equations of well-determined 
“type,” elliptic, hyperbolic, etc. boundary value problems have been 
studied over many years. Equations of arbitrary (or unknown) 
“type,” however, have only been considered recently. Friedrichs 
[16] has developed an existence theory of boundary value problems 
for first-order systems which are symmetric and positive, that is, 
which have the form (real coefficients): 

$$Lu = \Bigl(\sum_j A^j \frac{\partial}{\partial x^j} + A\Bigr) u = f \tag{3.1}$$

where $u = (u^1, \ldots, u^N)$ is a system of $N$ real functions,
$f = (f^1, \ldots, f^N)$, the $A^j$ and $A$ are square matrices, the $A^j$ being
symmetric; the positivity is expressed by the requirement that the
matrix

$$K = A - \frac12 \sum_j \frac{\partial}{\partial x^j} A^j \tag{3.2}$$

have positive definite symmetric part. Friedrichs' results have
been extended and simplified by Lax and Phillips [34] (see also
references there to Phillips), and further results have been obtained
by Sarason [48].

Before describing the boundary conditions we observe that, by
the symmetry of the $A^j$ (using summation convention),

$$u\,Lu = \tfrac12 (u A^j u)_{x^j} + u K u$$

Thus by Green's theorem

$$(Lu, u) = (Ku, u) + \frac12 \int_{\dot{\mathcal{D}}} u A_n u \, ds \tag{3.3}$$

where $\dot{\mathcal{D}}$ denotes the boundary of the domain $\mathcal{D}$ and $ds$ denotes its
element of area; $A_n = \sum n_j A^j$, where $n_j$ denotes the $j$th component
of the outer normal at the boundary.
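The pointwise identity behind (3.3) follows from the product rule and the symmetry of the matrices $A^j$. The sketch below assumes real $u$, writes $u \cdot v$ for the Euclidean inner product in $\mathbb{R}^N$, and sums over the repeated index $j$.

```latex
% Product rule, using A^j symmetric so that u_{x^j}\cdot A^j u = u\cdot A^j u_{x^j}:
(u\cdot A^j u)_{x^j}
   = 2\,u\cdot A^j u_{x^j} + u\cdot (A^j)_{x^j}\,u .

% Solve for the first-order term and insert into u\cdot Lu:
u\cdot Lu
   = u\cdot A^j u_{x^j} + u\cdot A u
   = \tfrac12\,(u\cdot A^j u)_{x^j}
     + u\cdot\Bigl(A - \tfrac12\,(A^j)_{x^j}\Bigr)u
   = \tfrac12\,(u\cdot A^j u)_{x^j} + u\cdot K u .

% Integrate over the domain; the divergence theorem turns the first term
% into a boundary integral with the outer normal n = (n_1,\dots,n_n):
(Lu,u) = (Ku,u) + \tfrac12 \int_{\text{boundary}} u\cdot A_n u\;ds ,
\qquad A_n = n_j A^j .
```

This is why the combination $K = A - \tfrac12 \sum_j \partial A^j/\partial x^j$, and not $A$ itself, is the matrix whose positivity matters.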





Partial Differential Equations; Applications in Geometry IS 


In describing the boundary conditions we suppose that the
boundary $\dot{\mathcal{D}}$ is decomposed into several smooth parts $\Gamma_1, \ldots, \Gamma_k$,
and that on each $\Gamma_j$ there is a linear subspace $N_j(x)$, varying
smoothly with $x$, of fixed dimension $m_j$. We then require that for $x$
in $\Gamma_j$ the boundary values of $u$ lie in $N_j(x)$. These boundary conditions
are called positive if at every point $x$ of $\Gamma_j$ the operator $A_n$ is
positive over $N_j(x)$, that is, $u \cdot A_n u \ge 0$ for all $u(x)$ in $N_j(x)$,
$j = 1, \ldots, k$. We see from (3.3) that if $u$ satisfies positive boundary
conditions and if the system is symmetric positive, the following
energy estimate holds:

(3.4)   $(u, u) \le \text{constant}\,(Lu, u)$
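
The step from (3.3) to (3.4) can be written out as follows. This is a sketch under one added assumption that the text does not state explicitly: the symmetric part of $K$ is bounded below uniformly, $u \cdot Ku \ge c|u|^2$, with a constant $c > 0$ (a name introduced here).

```latex
% Assumption (not stated explicitly in the text): the symmetric part of K
% is bounded below uniformly, u . K u >= c |u|^2 with a constant c > 0.
\begin{align*}
(Lu, u) &= (Ku, u) + \tfrac12 \int_{\dot{\mathcal D}} u \cdot A_n u \, ds
        && \text{by (3.3)} \\
        &\ge (Ku, u) && \text{positive boundary conditions} \\
        &\ge c\,(u, u) && \text{positivity of } K ,
\end{align*}
% hence (u, u) <= c^{-1} (Lu, u), which is (3.4).
```

In particular $Lu = 0$ forces $(u, u) \le 0$, so $u = 0$; this is the uniqueness statement for regular solutions.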

To solve such a boundary value problem we should ensure that
there are not too many boundary conditions (for we may be able to
enlarge some space $N_j$ and still maintain positivity). Thus we
assume that each $N_j$ is maximal positive (so in particular it contains
the null space of $A_n$ at each point $x \in \Gamma_j$). It then follows that
the adjoint boundary conditions are positive for the adjoint operator
$L^*$. These boundary conditions are expressed by the requirement
that $u$ lie in $P_j(x)$ for $x \in \Gamma_j$, where $P_j$ is the orthogonal
complement of $A_n(x)N_j(x)$; clearly we have

$(Lu, v) = (u, L^*v)$

if $u$ and $v$ satisfy the boundary and adjoint boundary conditions.
Note that the dimensions $m_j$ depend strongly on the nature of the
operator $A_n$.

By using the projection theorem in Hilbert space, it is not difficult
to establish the existence of weak solutions of such symmetric positive
problems with positive boundary conditions. But weak solutions
are not necessarily unique, and the real work comes in establishing
the regularity of the solutions. Uniqueness for regular
solutions follows from (3.4). Under an additional condition
Friedrichs proved sufficient regularity of the weak solutions to
guarantee uniqueness. Lax and Phillips [34] have presented a
simpler and more general regularity theorem near the boundary for
rather general first-order systems, not necessarily symmetric (see
also [48]). The argument is local; by first smoothing the solution
in directions tangent to the boundary they demonstrate regularity
in the tangential directions and then derive general regularity.






Friedrichs has shown that many problems of mathematical 
physics may be expressed by symmetric positive systems (although 
often it is not trivial to do this) and may be treated by his theory. 
For instance he treats the Tricomi problem involving an equation 
which is elliptic in some region and hyperbolic in another. 

This situation in which the main difficulty is proving regularity 
of the solution is typical. The general problem of finding under 
which boundary conditions solutions of a given differential equation 
are necessarily regular has not been completely solved. The case 
of a single equation with constant coefficients for one function in a 
half space restricted on the boundary by differential boundary 
conditions, also having constant coefficients, has been solved by 
Hörmander [21]. In general the results will depend on the lower-order
terms (not just the highest) and on the nature of the boundary
(see the problem of Section 7). 

4. SINGULAR INTEGRAL OPERATORS 

We conclude this lecture by describing the construction and basic
properties of a class of operators introduced by Calderón and Zygmund
[10], and Mihlin [39]. These include differential operators, in the
sense that any differential operator may be expressed in terms of
these and the Laplace operator. These operators have already
proven to be a very useful and important tool, and will continue
to be so. For further references we may mention also [9],
[51], [50], and [33]. We shall restrict ourselves to operators on
scalar valued square integrable functions defined in all of space.

The operators form a special algebra of operators mapping the
space of $L^2$ functions continuously into itself (we may also consider
$H^s$ into itself), which are considered only modulo the ideal $K$ of
compact operators. Furthermore they are local operators in the
sense that if $T$ is such an operator and if $u$, $v$ are $L^2$ functions with
supports in fixed disjoint sets then $(u, Tv)$ is compact in $u$ and in $v$.

We start with the following special operators.

1. Multiplication by a $C^\infty$ function $a(x)$ having a limit at infinity.

2. Multiplication of the Fourier transform $\hat u(\xi)$ of $u(x)$,

$\hat u(\xi) = (2\pi)^{-n/2} \int e^{-i x \cdot \xi} u(x) \, dx = Fu$

by a function $\alpha(\xi)$ which is homogeneous of degree zero and $C^\infty$ on
the unit sphere $|\xi| = 1$.







These operators are clearly continuous maps of $L^2$ into itself.
We shall also have need of the operator $\Lambda$, the square root of the
Laplace operator, which may be defined in terms of the Fourier
transform operator $F$ by

(4.1)   $\Lambda = F^{-1}|\xi|F$

Here $|\xi|$ denotes multiplication by $|\xi|$. The operator $\Lambda$ is defined
on a dense set in $L^2$, for instance the $C^\infty$ functions dying down
rapidly at infinity. Let us in fact confine ourselves to these for
the moment.
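
To see in what sense $\Lambda$ is a square root, one can compute $\Lambda^2$ on rapidly decreasing smooth functions. The sketch below assumes the Fourier convention $\hat u(\xi) = (2\pi)^{-n/2}\int e^{-ix\cdot\xi}u(x)\,dx$; the opposite sign in the exponent only replaces $i\xi_j$ by $-i\xi_j$ and leads to the same result.

```latex
% Assumption: \hat u(\xi) = (2\pi)^{-n/2} \int e^{-i x\cdot\xi} u(x)\,dx,
% with u smooth and rapidly decreasing, so integration by parts is allowed.
\begin{align*}
F(\partial_{x^j} u)(\xi) &= i\,\xi_j\, \hat u(\xi), \\
F(-\Delta u)(\xi) &= \sum_j \xi_j^2\, \hat u(\xi) = |\xi|^2\, \hat u(\xi), \\
\Lambda^2 &= F^{-1}|\xi| F\, F^{-1} |\xi| F = F^{-1} |\xi|^2 F = -\Delta .
\end{align*}
```

So, with this normalization, $\Lambda$ is the positive square root of $-\Delta$.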

The operators of 1 and 2, which we denote simply by $a$, $\alpha$, have
the important property that their commutators are compact operators.
In fact the operators

(4.2)   $\Lambda[a, \alpha], \quad [a, \alpha]\Lambda$

are bounded as maps of $L^2$ into $L^2$.

The algebra of operators we will consider will be generated by the
special operators of the type 1 and 2. Consider first finite sums

$\sum_{j=1}^{N} a_j \alpha_j$

With such a sum we associate the symbol

$h(x, \xi) = \sum_j a_j(x)\alpha_j(\xi)$

According to the preceding these form an algebra modulo $K$.
Naturally we want to complete this algebra by taking its closure in
the norm sense. We would expect that to every “symbol,” that is,
a function $h(x, \xi)$ which is homogeneous of degree zero in $\xi$ and
sufficiently regular, say of class $C^k$, $k = k(n)$, in $(x, \xi)$ (for $|\xi| = 1$),
and having a limit (depending on $\xi$) as $x \to \infty$, it is possible to
define a corresponding bounded operator $H$ of $L^2$ into $L^2$ mod $K$.
Indeed this is the case for $k$ sufficiently large and may be shown by
an expansion of $h$ in the form $h = \sum a_j(x)\alpha_j(\xi)$. Furthermore, the
commutator of any two operators $G$, $H$ obtained in this way is again
compact, and in fact $\Lambda[G, H]$, $[G, H]\Lambda$ are bounded operators.

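We may now represent any differential operator (with smooth
coefficients having limiting values at $\infty$) in a special way. Suppose
the operator is given in the form $P = \sum a_\alpha D^\alpha$. The operator $D^\alpha$
may be written as

$D^\alpha = F^{-1}\xi^\alpha F = F^{-1}\frac{\xi^\alpha}{|\xi|^{|\alpha|}}|\xi|^{|\alpha|}F = \Xi^{(\alpha)}\Lambda^{|\alpha|}$

where $\Xi^{(\alpha)}$ is here defined. Then we may write

$P = \sum a_\alpha \Xi^{(\alpha)} \Lambda^{|\alpha|} = \sum \Lambda^{|\alpha|} a_\alpha \Xi^{(\alpha)}$ (mod $K$)

in virtue of the preceding. If $P'$ is the principal part of $P$, of order
$m$, then $P' = \sum_{|\alpha| = m} a_\alpha \Xi^{(\alpha)} \Lambda^m$; thus we associate to $P$ the singular
integral operator

$\sum_{|\alpha| = m} a_\alpha \Xi^{(\alpha)}$ with symbol $\sum_{|\alpha| = m} a_\alpha(x)\xi^\alpha = P'(x, \xi)$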
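
As an illustration, the representation can be written out for the Laplace operator. This is a sketch assuming the normalization $D_j = -i\,\partial/\partial x^j$, so that $D^\alpha = F^{-1}\xi^\alpha F$ with the Fourier convention $e^{-ix\cdot\xi}$; $e_j$ denotes the $j$th unit multi-index, a notation introduced here.

```latex
% Assumptions: D_j = -i \partial/\partial x^j, Fourier kernel e^{-i x\cdot\xi};
% e_j is the multi-index with 1 in position j and 0 elsewhere.
\begin{align*}
P = -\Delta &= \sum_j D_j^2 = \sum_j D^{2e_j}, \\
\Xi^{(2e_j)} &= F^{-1}\, \frac{\xi_j^2}{|\xi|^2}\, F,
  \qquad \sum_j \Xi^{(2e_j)} = F^{-1}\, \frac{|\xi|^2}{|\xi|^2}\, F = I, \\
P &= \sum_j \Xi^{(2e_j)} \Lambda^2 = \Lambda^2, \\
P'(x, \xi) &= \sum_j \xi_j^2 = |\xi|^2 = 1 \quad \text{on } |\xi| = 1 .
\end{align*}
```

Here the associated singular integral operator is the identity, and its symbol does not vanish on the unit sphere.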


Returning to the algebra of operators obtained from (smooth)
symbols $h(x, \xi)$, it is shown that the mapping: algebra of symbols $\to$
algebra of operators (mod $K$), is one-to-one and is an algebraic
homomorphism. Also for any smooth function $u(x)$ dying down
rapidly at $\infty$ we have the representation

$(Hu)(x) = (2\pi)^{-n/2} \int e^{i x \cdot \xi} h(x, \xi)\hat u(\xi) \, d\xi$

and the representation as a singular integral operator with a kernel
$k$:

$(Hu)(x) = \int k(x, x - y)u(y) \, dy$

where $k(x, z)$ is homogeneous in $z$ of degree $-n$ and regular in $(x, z)$,
for $|z| = 1$. Furthermore, the norm of the operator $H$ (mod $K$)
turns out to be $\sup |h(x, \xi)|$. Therefore, taking closure we see that
we can assign to any continuous symbol $h(x, \xi)$ an operator (mod $K$)
(i.e., an equivalence class of bounded operators of $L^2 \to L^2$ modulo
$K$).

Elliptic Operator. We observe that if a symbol $h(x, \xi)$ does not
vanish, the corresponding operator $H$ has an inverse $H^{-1}$ (mod $K$)
defined by the symbol $h^{-1}(x, \xi)$. Thus such an operator $H$ has a
finite dimensional null space. Furthermore, its range is closed and
has finite codimension. (The converse is also true; these properties
of the null space and the range imply that $h$ does not vanish.)
Hence it has a finite index:


Index = dimension of null space — codimension of range 





In analogy with differential operators the corresponding operator is 
called elliptic. 

The representation of differential operators by means of singular
integral operators was used by Calderón in his study of the Cauchy
problem [8] and in treating hyperbolic equations [9]. These operators
have also been used recently in elliptic problems (see Section 8), where
suitable extensions to manifolds are used. More recently, in a very
interesting paper,* Calderón has proved, with their aid, existence
theorems in the large for very general systems of partial differential
equations. The demonstrations involve new uniqueness theorems
for the Cauchy problem in a half-space.


Lecture II Elliptic Differential Equations 
in Differential Geometry 

We shall take up three geometric-analytic topics that are concerned
with or make use of recent developments in elliptic partial
differential equations. These are:

§6. Work of Kodaira, Spencer, and Kuranishi on existence and
deformation of complex structure on compact manifolds.

§7. The $\bar\partial$-Neumann problem; work of Kohn and Morrey.

§8. The index problem for elliptic operators on compact manifolds;
work of Atiyah, Singer.

In Section 6 we describe the problems and attempt to give the 
structure of the arguments used. In the other sections the results 
and techniques are too involved and we can only give the merest 
hint about the nature of the analytic methods employed. 


5. PRELIMINARIES 

As a well-known illustration of the use of elliptic equations in
global differential geometry, let us recall some basic facts of the
Hodge-Kodaira theory. Let $V$ be a compact Riemannian manifold.
The Riemannian metric enables one to define $\delta$, the adjoint of the
exterior differential operator $d$ acting on exterior differential forms.

* A. P. Calderón, “Existence and uniqueness theorems for systems of partial
differential equations,” Fluid Dynamics and Applied Mathematics, Proceedings
of Symposium at Institute for Fluid Dynamics and Applied Mathematics,
University of Maryland, April 1961. Gordon and Breach (1962) 147-195.

The operator $d + \delta$ on all forms is elliptic. Also the Laplace
operator

$\Delta = d\delta + \delta d$

which preserves the degree is strongly elliptic, and one has the
Hodge-Kodaira decomposition theorem: there exists a Green’s
operator $G$ commuting with $d$ and $\delta$ such that

$I = H + \Delta G = H + d\delta G + \delta d G$

where $H$ is the projection onto the finite dimensional space of
harmonic forms $\phi$ (solutions of $\Delta\phi = 0$). Thus any form can be
decomposed uniquely into a harmonic form plus $d$ of a form plus $\delta$ of a
form. (Furthermore the space of harmonic forms of degree $p$ has
finite dimension equal to the $p$th Betti number of $V$.) This
decomposition formula, derived by Hilbert space arguments, may
be extended in a similar way to any strongly elliptic operator or
system of the form

$\Delta = \partial\vartheta + \vartheta\partial$

if $\partial^2 = 0$ and $\vartheta$ is the adjoint of $\partial$.
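
The uniqueness of the decomposition rests on the mutual orthogonality of its three parts, which can be checked directly. The sketch below assumes that $V$ is compact without boundary, that $( , )$ is the $L^2$ scalar product, and that $\delta$ is the formal adjoint of $d$; $\alpha$ and $\beta$ are arbitrary smooth forms, names introduced here.

```latex
% Assumptions: V compact without boundary, ( , ) the L^2 scalar product,
% \delta the formal adjoint of d, and d^2 = 0.
\begin{align*}
(\Delta\phi, \phi) &= (\delta\phi, \delta\phi) + (d\phi, d\phi),
  \quad\text{so } \Delta\phi = 0 \iff d\phi = 0 \text{ and } \delta\phi = 0, \\
(d\alpha, \delta\beta) &= (d^2\alpha, \beta) = 0, \\
(\phi, d\alpha) &= (\delta\phi, \alpha) = 0, \qquad
(\phi, \delta\beta) = (d\phi, \beta) = 0
  \quad\text{for harmonic } \phi .
\end{align*}
```

The same computation applies to the general system above, since it uses only $\partial^2 = 0$ and the adjoint relation.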

We turn now to some technical material on elliptic boundary value
problems which is used in the subsequent sections. Consider an
elliptic system of $N$ equations for $N$ functions $u = (u_1, \ldots, u_N)$
of $n$ variables $x = (x^1, \ldots, x^n)$. We use summation convention:

(5.1)   $l_{ij}(D)u_j = f_i, \quad i = 1, \ldots, N$

where the $l_{ij}$ are differential operators of, for simplicity (see however
[13] and [2], part II), the same order $m$. Ellipticity means
that the characteristic matrix $l'_{ij}(\xi)$ of leading terms is nonsingular
for real $\xi = (\xi_1, \ldots, \xi_n) \ne 0$; if $n > 2$, this implies that $mN$ is
even, and we shall always assume this to be the case. If the
quadratic form associated with the characteristic matrix is positive
definite for real $\xi \ne 0$ the system is called strongly elliptic. For
functions $u$ defined in a bounded domain we also impose $mN/2$
differential boundary conditions

(5.2)   $B_{\alpha j}(D)u_j = \phi_\alpha, \quad \alpha = 1, \ldots, \frac{mN}{2}$

of orders $< m$ (for simplicity).
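
A small concrete instance may help fix the notation. The following sketch is not taken from the text: it uses the Cauchy-Riemann system in the plane, with $n = 2$, $N = 2$, $m = 1$, and writes $x, y$ for the two variables.

```latex
% Illustrative system (not from the text): n = 2, N = 2, m = 1.
\begin{align*}
\partial_x u_1 - \partial_y u_2 &= f_1, \\
\partial_y u_1 + \partial_x u_2 &= f_2,
\end{align*}
\[
l'(\xi) = \begin{pmatrix} \xi_1 & -\xi_2 \\ \xi_2 & \xi_1 \end{pmatrix},
\qquad \det l'(\xi) = \xi_1^2 + \xi_2^2 \ne 0 \quad (\xi \ne 0).
\]
```

The characteristic matrix is nonsingular for real $\xi \ne 0$, so the system is elliptic, and $mN/2 = 1$ boundary condition of order zero is to be imposed, for instance prescribing $u_1$ on the boundary.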





The boundary conditions are said to be coercive in some norm
$\|\ \|$ [such as $L^2$, $L^p$, or $\|\ \|_s$ norm, or Hölder norm $|\ |_a$ defined
by $|f|_a = \sup |f| + \sup (|f(x) - f(y)|/|x - y|^a)$] if the following
norms are equivalent (where $\|\ \|_\alpha$ are norms for functions defined
on the boundary):

(5.3)   $\sum_j \sum_{|\alpha| \le m} \|D^\alpha u_j\|$   and   $\sum_i \|l_{ij}u_j\| + \sum_j \|u_j\| + \sum_\alpha \|B_{\alpha j}u_j\|_\alpha$
Then we can estimate all the derivatives occurring in $l_{ij}u_j$ in terms
of the $l_{ij}u_j$. It is easy to estimate all such derivatives in compact
subsets of the domain, but these estimates hold up to the boundary
if and only if the boundary operators satisfy known algebraic conditions.
This problem has been treated in detail for various norms
by Agmon, Douglis, and Nirenberg [2], and Browder [7]. The more
general question, initiated by Aronszajn, when such estimates hold
in $L^2$ for any number of differential operators (not just $N$) and any
number of boundary conditions, has been solved by Agmon [1] and
Hörmander for $N = 1$ and by Figueiredo [14] and Hörmander
for arbitrary $N$. Hörmander’s book [25] also contains a treatment
of the existence theory. Lions and Magenes [36] (where other
references may be found) have made extensions of the results of [2]
using the theory of interpolation of operators in Banach space; see
also Schechter [49]. Many recent developments in elliptic theory
are contained in the lecture notes [47] by Peetre, where other references
to existence theory can be found.

Under the conditions for coerciveness the elliptic operator (with
these boundary conditions) has a null space of finite dimension $s$, and
closed range with finite codimension $r$, so that the index $s - r$ is
well defined. Furthermore, if the spectrum is not the whole plane
it is discrete. The estimates described here are used in Sections 6
and 8. In particular, in Section 6 where a nonlinear problem
occurs, the estimates with special norms play an important role.
The spaces with these norms are such that the product of two functions
in the space belongs again to the space. This is true for the
norms $|\ |_{k+a}$, that is, derivatives up to order $k$ satisfying a Hölder
condition with exponent $a$, and for the spaces $H^s$, for $s > n/2$.
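
For the simplest Hölder norm $|\ |_a$ (the case $k = 0$) the product property can be verified in a few lines. The sketch below introduces the notation $[f]_a$ for the Hölder seminorm, which is not used in the text.

```latex
% Notation (introduced here): [f]_a = \sup_{x \ne y} |f(x)-f(y)| / |x-y|^a,
% so that |f|_a = \sup|f| + [f]_a as in the text.
\begin{align*}
|f(x)g(x) - f(y)g(y)| &\le |f(x)|\,|g(x)-g(y)| + |g(y)|\,|f(x)-f(y)|, \\
[fg]_a &\le \sup|f|\,[g]_a + \sup|g|\,[f]_a, \\
|fg|_a &\le \sup|f|\sup|g| + \sup|f|\,[g]_a + \sup|g|\,[f]_a \le |f|_a\,|g|_a .
\end{align*}
```

The last inequality holds because the product $|f|_a|g|_a$ contains the three terms on the left plus the nonnegative term $[f]_a[g]_a$.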

Certain boundary value problems arise in practice where the
conditions for coerciveness are not satisfied, but where nevertheless
the null space is finite dimensional and the range is closed. In such
a case the lower-order terms, or the geometric nature of the boundary,
such as pseudoconvexity, play an essential role. The $\bar\partial$-Neumann
problem of Section 7 is a problem of this kind.


6. EXISTENCE AND DEFORMATION OF COMPLEX 
STRUCTURE ON MANIFOLDS* 


We shall give a brief description of the use of differential equations
in some of the work of Kodaira, Spencer, and Kuranishi on
deformation of complex structure on a compact manifold. Some
of the questions they considered grew out of work by Frölicher and
Nijenhuis. Throughout, $V$ will be used to denote a compact
manifold. In addition to the papers cited a useful reference is [17].

We recall that a manifold $V$ (of even dimension $2n$) is said to be a
complex manifold if it is possible to cover $V$ by coordinate patches in
each of which the local $2n$ real coordinates may be expressed as
distinguished complex coordinates $z^1, \ldots, z^n$ in such a way that
the transformation between local coordinates of overlapping patches
is expressed by holomorphic functions; for $n = 1$ such a manifold is
simply a Riemann surface. The space of first-order forms with
complex coefficients may then be decomposed into the two spaces:
the forms spanned by $dz^1, \ldots, dz^n$, called type (1, 0), and that
spanned by the complex conjugates $d\bar z^1, \ldots, d\bar z^n$, type (0, 1),
these spaces being invariant under (holomorphic) change of coordinates.
Furthermore, the following space of forms, called type
$(p, q)$, is invariant: forms spanned by exterior products of $p$ forms
of type (1, 0) and $q$ forms of type (0, 1). The exterior differentiation
operator $d$ then decomposes naturally into $d = \partial + \bar\partial$ where

$\partial(a \, dz^{r_1} \wedge \cdots \wedge dz^{r_p} \wedge d\bar z^{s_1} \wedge \cdots \wedge d\bar z^{s_q}) = \sum_k \frac{\partial a}{\partial z^k} \, dz^k \wedge dz^{r_1} \wedge \cdots \wedge dz^{r_p} \wedge d\bar z^{s_1} \wedge \cdots \wedge d\bar z^{s_q}$

$\partial$ and $\bar\partial$ map forms of type $(p, q)$ into type $(p + 1, q)$ and $(p, q + 1)$
respectively; they satisfy $\partial^2 = \bar\partial^2 = 0$ and $\partial\bar\partial + \bar\partial\partial = 0$.
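
These three identities follow from $d^2 = 0$ by sorting terms according to type. A short sketch, for a form $\phi$ of type $(p, q)$:

```latex
% Assumption: \phi is a form of type (p, q); the three terms below have
% types (p+2, q), (p+1, q+1), (p, q+2) respectively.
\[
0 = d^2\phi = (\partial + \bar\partial)^2\phi
  = \partial^2\phi + (\partial\bar\partial + \bar\partial\partial)\phi + \bar\partial^2\phi .
\]
```

Since forms of different types are linearly independent, each of the three terms must vanish separately.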

* The presentation here is based on several conversations with Prof. M. Kuranishi
to whom the author wishes to express his warm thanks.





On a complex manifold we may introduce a real analytic Riemannian
metric which is also Hermitian, that is, which in terms of distinguished
holomorphic coordinates is given by $g_{j\bar k} \, dz^j \, d\bar z^k$ (using
summation convention). Such a metric may be used to define, in
the usual way, an $L^2$ scalar product ( , ) for forms of the same
type, and, correspondingly, the formal adjoint $\vartheta$ of $\bar\partial$ by
$(\bar\partial\phi, \psi) = (\phi, \vartheta\psi)$; $\vartheta$ sends forms of type $(p, q)$ into $(p, q - 1)$
and $\vartheta^2 = 0$. We readily verify that the operator

(6.1)   $\Box = \bar\partial\vartheta + \vartheta\bar\partial$

preserves type, commutes with $\bar\partial$ and $\vartheta$, and is a (strongly) elliptic
partial differential operator on forms of any fixed type.

A given manifold $V$ may have different complex analytic structures
put on it which are not equivalent. Letting $V_1$, $V_2$ denote $V$
with two analytic structures on it, we say that $V_1$ and $V_2$ are equivalent
if there is a diffeomorphism of $V_1$ onto $V_2$ which is also
holomorphic with respect to the analytic structures. Since Riemann,
people have studied deformation of the complex structure on
a Riemann surface; the (complex) parameters on which the deformations
depend are called the moduli, and Riemann found the number
of moduli of a compact surface of genus $p$ to be the dimension of the
Lie group of complex analytic automorphisms of the surface plus
$3p - 3$. In recent years, after Teichmüller’s work on extremal
quasiconformal mappings, the global study of the space of moduli
which, after some normalization, is called the Teichmüller space,
has been carried out by various authors: Rauch, Ahlfors, Bers,
Weil, and others. We refer the reader to [4] and [6] for a description
of this work and references.

The problems treated in Kodaira and Spencer [28], and Kuranishi 
[32] (and a paper to appear), which we shall discuss, are the 
following: 

1. Given a complex analytic manifold $V_0$, can we construct a
nontrivial family $V_t$ of complex structures on $V_0$ (with $V$ as basic
underlying $C^\infty$ structure) depending continuously on a finite number
of parameters $t$ in a set $M$ in a Euclidean $t$-space $T$, with $V_0$ corresponding
to $t = 0$? Can we take $M$ to be a subvariety and
require the dependence on $t$ to be $C^\infty$ (or even analytic)? We say
that the family $V_t$ is centered at $V_0$.

Consider another family $V_s$ centered at $V_0$ depending on param-






an $L^2$ scalar product on $\Phi^{0,q}$, which we denote again by ( , ).

The operator $\bar\partial$ sends $\Phi^{0,q}$ into $\Phi^{0,q+1}$, and its formal adjoint we
denote again by $\vartheta$; the operator $\Box = \vartheta\bar\partial + \bar\partial\vartheta$ is again seen to be
strongly elliptic, and we may therefore obtain the analogue of the
Hodge-Kodaira decomposition theorem: There is an operator $G$, the
Green’s operator, commuting with $\bar\partial$ and $\vartheta$, such that the identity
operator may be decomposed into

(6.7)   $I = H + \vartheta\bar\partial G + \bar\partial\vartheta G$

where $H$ is the orthogonal projection (in terms of our scalar product)
into the harmonic space, that is, the space of forms annihilated by
$\Box$. This space has finite dimension, say $m_q$, which, it should be
noted, is independent of the choice of Hermitian metric. Thus
any $\phi \in \Phi^{0,q}$ admits the unique orthogonal decomposition into

(6.8)   $\phi = H\phi + \vartheta\phi_1 + \bar\partial\phi_2$

Note that $\Box\phi = 0 \iff \bar\partial\phi = 0$ and $\vartheta\phi = 0$, by Green’s theorem.
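
The equivalence in the last remark comes from a single integration by parts. A sketch, assuming $V$ compact so that no boundary terms occur and $\vartheta$ is the formal adjoint of $\bar\partial$:

```latex
% Assumptions: V compact (no boundary terms), \vartheta the formal adjoint
% of \bar\partial in the L^2 scalar product ( , ).
\[
(\Box\phi, \phi) = (\vartheta\bar\partial\phi, \phi) + (\bar\partial\vartheta\phi, \phi)
 = (\bar\partial\phi, \bar\partial\phi) + (\vartheta\phi, \vartheta\phi) .
\]
```

If $\Box\phi = 0$ both nonnegative terms on the right vanish; the converse is immediate from the definition of $\Box$.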

We shall show, following Kuranishi [32] (and an unpublished
paper), that the set of solutions $\omega \in \Phi^{0,1}$ of a larger system

(6.9)   $\bar\partial\omega - [\omega, \omega] = 0, \quad \vartheta\omega = 0$

near $\omega = 0$ may be described in terms of $m_1$ complex parameters
$t = (t_1, \ldots, t_{m_1})$ lying on a complex analytic set, and that this
set yields a locally complete family $V_t$ of complex structures.

We observe that the linearized problem

$\bar\partial\omega = 0, \quad \vartheta\omega = 0$

has an $m_1$ (complex) dimensional space of solutions, namely the
elements of $H\Phi^{0,1}$. This suggests that the nonlinear problem should
also have a finite parameter family of solutions near $\omega = 0$ which
might be obtainable by a bifurcation process (see, for example,
Section 7.2 in [44]).

Any solution of (6.9) satisfies, in virtue of (6.7),

$\omega = H\omega + \vartheta G[\omega, \omega]$
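
This identity can be obtained by applying the decomposition (6.7) to $\omega$ and using that $G$ commutes with $\bar\partial$ and $\vartheta$. The sketch below assumes that (6.9) consists of the two equations $\bar\partial\omega = [\omega, \omega]$ and $\vartheta\omega = 0$.

```latex
% Assumptions: (6.9) reads \bar\partial\omega = [\omega,\omega], \vartheta\omega = 0;
% G commutes with \bar\partial and \vartheta, as stated with (6.7).
\begin{align*}
\omega &= H\omega + \vartheta\bar\partial G\omega + \bar\partial\vartheta G\omega
       && \text{by (6.7)} \\
       &= H\omega + \vartheta G\,\bar\partial\omega + \bar\partial G\,\vartheta\omega \\
       &= H\omega + \vartheta G[\omega, \omega] .
\end{align*}
```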

By a slight modification of the standard bifurcation process we shall
solve the nonlinear equation

(6.10)   $\omega = \omega_0 + \vartheta G[\omega, \omega]$

where $\omega_0$ is an arbitrarily prescribed element of $H\Phi^{0,1}$. If $\phi_1, \ldots,
\phi_{m_1}$ is a fixed basis for $H\Phi^{0,1}$, we may set $\omega_0 = \sum_{j=1}^{m_1} t_j\phi_j$ with the $t_j$
complex numbers. For $\omega_0$ having small norm, where the norm is
yet to be chosen, we show with the aid of the implicit function
theorem that (6.10) has a unique solution $\omega(\omega_0)$ near $\omega = 0$, which
then necessarily depends holomorphically on $\omega_0$, that is, on the
parameters $t$. We then require that $\omega(\omega_0)$ satisfy also (6.9). This
requirement gives rise to the “bifurcation equation” which reduces
to a finite number of analytic relations on the $\omega_0$, that is, on the
parameters $t$; hence the analytic set in the $t$ space. The set of
solutions thus obtained contains therefore all solutions of (6.9) with
small norm.

We must first choose a suitable normed space in which to work so
that the implicit function theorem may be applied to (6.9). The
appropriate space is furnished by the estimates of the previous section.
We may either choose the space of forms with coefficients
$C^{k+a}$, that is, having derivatives up to order $k$ which satisfy Hölder
conditions with exponent $a$ (in some fixed coordinate patches covering
$V$) as in [27] and [32], or else the space of forms with coefficients
in $H^s$ for $s$ sufficiently large; the appropriate norms $\|\ \|_{k+a}$ or $\|\ \|_s$
are then easily globally defined and we verify that, say,

(6.11)   $\|[\sigma, \tau]\|_{s-1} \le \text{constant}\, \|\sigma\|_s \|\tau\|_s$

and, in virtue of the estimates of the previous section,

(6.12)   $\|\vartheta G\phi\|_s \le \text{constant}\, \|\phi\|_{s-1}$

We then easily obtain a unique small solution $\omega(\omega_0)$ of (6.10) if
$\|\omega_0\|_s$ is small by, say, iterations

$\omega^{(r+1)} = \omega_0 + \vartheta G[\omega^{(r)}, \omega^{(r)}], \quad \omega^{(1)} = \omega_0$

$\omega(\omega_0(t))$ is holomorphic in $t$.

We observe that $(\partial/\partial t_j)\,\omega(\omega_0(t))|_{t=0} = \phi_j \ne 0$, that is, the mapping
of the tangent space of the $t$ space at the origin into $H\Phi^{0,1}$ is injective.
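
The convergence of the iterations for (6.10) can be sketched as a contraction argument. The constants $C_1$, $C_2$ (those of (6.11) and (6.12)), the radius $\rho$, and the map $T$ are names introduced here for illustration; the bracket is assumed bilinear.

```latex
% Assumptions: [ , ] is bilinear; C_1, C_2 are the constants in (6.11), (6.12);
% T(\omega) = \omega_0 + \vartheta G[\omega,\omega] is the iteration map.
\begin{align*}
[\sigma,\sigma] - [\tau,\tau] &= [\sigma - \tau, \sigma] + [\tau, \sigma - \tau], \\
\|T(\sigma) - T(\tau)\|_s
  &\le C_2 \,\|[\sigma,\sigma] - [\tau,\tau]\|_{s-1}
   \le C_1 C_2 \bigl(\|\sigma\|_s + \|\tau\|_s\bigr)\|\sigma - \tau\|_s .
\end{align*}
% On the ball \|\omega\|_s \le \rho with 2 C_1 C_2 \rho < 1 the map T is a
% contraction; it maps the ball into itself when
% \|\omega_0\|_s + C_1 C_2 \rho^2 \le \rho.
```

Both conditions can be met by taking $\rho$ small and then $\|\omega_0\|_s$ small, which gives a unique fixed point in the ball.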

To obtain a solution of (6.9) suppose that $\omega$ is the solution of
(6.10); operating with $\bar\partial$ we find

$\bar\partial\omega = \bar\partial\vartheta G[\omega, \omega] = \Box G[\omega, \omega] - \vartheta\bar\partial G[\omega, \omega]$

or, by (6.7),

(6.13)   $-\bar\partial\omega + [\omega, \omega] = H[\omega, \omega] + \vartheta G\bar\partial[\omega, \omega]$

Thus (6.9) is satisfied if and only if the right-hand side is zero; the
terms on the right being mutually orthogonal this means that they
both vanish:

(6.14)   $H[\omega(\omega_0), \omega(\omega_0)] = 0$

(6.15)   $\vartheta G\bar\partial[\omega(\omega_0), \omega(\omega_0)] = 0$

Observe that $H$ is the projection on the $m_2$ dimensional space of
harmonic elements of $\Phi^{0,2}$. Thus equation (6.14) involves a finite
number of analytic scalar relations, and these determine a complex
analytic set $M$ in the $t$-space. The other relation (6.15) seems to
involve an infinite number of relations. It is a consequence of (6.4)
to (6.6) however that (6.14) implies (6.15). To see this observe
that from (6.4) and (6.5),

$\vartheta G\bar\partial[\omega, \omega] = 2\vartheta G[\bar\partial\omega, \omega]$

Substituting the expression for $\bar\partial\omega$ given by (6.13), with $H[\omega, \omega] = 0$,
we find

$\vartheta G\bar\partial[\omega, \omega] = 2\vartheta G[[\omega, \omega], \omega] - 2\vartheta G[\vartheta G\bar\partial[\omega, \omega], \omega]$

$= -2\vartheta G[\vartheta G\bar\partial[\omega, \omega], \omega]$

by (6.6). It follows then from (6.11) and (6.12) that

$\|\vartheta G\bar\partial[\omega, \omega]\|_s \le \text{constant}\, \|\vartheta G\bar\partial[\omega, \omega]\|_s \cdot \|\omega\|_s$

and hence, if $\|\omega\|_s$ is sufficiently small (which is the case if $\|\omega_0\|_s$ is
small), it follows that $\vartheta G\bar\partial[\omega, \omega] = 0$.
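
The last step is a simple absorption argument, written out here with $x$ standing for the norm in question and $C$ for the constant in the preceding inequality (both names introduced for illustration).

```latex
% Notation: x = \|\vartheta G \bar\partial[\omega,\omega]\|_s \ge 0, C the constant above.
\[
x \le C\,\|\omega\|_s\, x
\;\Longrightarrow\; (1 - C\|\omega\|_s)\,x \le 0
\;\Longrightarrow\; x = 0 \quad\text{whenever } C\|\omega\|_s < 1 .
\]
```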

We thus have determined a family $V_t$ of complex structures (with
$t$ in a complex analytic set) satisfying (1). (In [27] this family was
constructed under the assumption that $H\Phi^{0,2}$ vanishes, so that
(6.14) and (6.15) then hold automatically for all $t$, and $V_t$ is a full
complex parameter family. For Riemann surfaces, $n = 1$, this is
trivially the case; in fact the whole problem is linear.)

Kodaira and Spencer [28] have given examples showing that
$V_t$ may be equivalent to $V_{t'}$ with $t' \ne t$.

We have yet to show that V t satisfies the conditions of (2), 
that is, is locally complete. This is proved in [32] but follows most 
easily from a recent theorem of Kuranishi (to appear). 






Theorem 3. Let $\phi$ be a solution of (6.3) with small norm. Then
it is similar to a solution of (6.9), that is, there is a diffeomorphism
$g$ of $V$ onto itself which is close to the identity such that the composition
$\phi \circ g$ satisfies $\vartheta(\phi \circ g) = 0$. Furthermore, $g$ can be made
unique as follows: $g$ can be constructed by exponentiation (via
geodesics) of a vector field which is orthogonal to the holomorphic
vector fields. In particular if $\phi(s)$ is a family of solutions of (6.3),
the resulting diffeomorphisms $g(s)$ depend regularly on $s$.

This theorem, which is proved in turn with the aid of the implicit
function theorem, immediately yields the desired properties of our
family $V_t$, for if $V_s$ is a family of complex structures centered at $V_0$,
the corresponding almost complex structures are described by elements
$\phi(s)$ in $\Phi^{0,1}$ which are solutions of (6.3). According to the
theorem there is a diffeomorphism $g(s)$ such that $\phi(s) \circ g(s)$ satisfies
(6.9) for each $s$; but then it must be a unique one of our $\omega(\omega_0(t))$
since these contain all solutions near $\omega = 0$. Hence there is a well-defined
mapping $t = f(s)$ such that

$\phi(s) \circ g(s) = \omega(f(s))$

and this is the desired property.

We recall that for $t$ small the vector (0, 1) form $\omega(t) = \omega(\omega_0(t))$ is
holomorphic on the analytic set $M$. Suppose now we consider in
general a family $V_t$ as in (1) depending $C^\infty$ on some complex parameters
$t$ in a neighborhood $M$ of the origin in some Euclidean $t$-space.
We say that the resulting family is complex analytic, and that $M$
has a natural complex analytic structure, if we can introduce new
coordinates on $M$ and find a $C^\infty$ family of diffeomorphisms $g(t)$ of $V$
onto itself so that the almost complex structure forms $\omega(t) \circ g(t)$
depend holomorphically on these new coordinates. In [29] Kodaira
and Spencer also take up the question:

3. When can one assert that a given family $V_t$ is complex
analytic?

Kodaira and Spencer proved that certain conditions are sufficient.
We shall now describe these. Consider a family $g(t)$ of diffeomorphisms
of $V$. If we take the derivative of $\omega(t)$ with respect to
the real or imaginary part of one of the $t$ variables at $t = 0$, we
obtain a vector (0, 1) form $\phi$ which because of (6.3) satisfies $\bar\partial\phi = 0$.
If, however, we take the derivative of $\omega(t) \circ g(t)$, we obtain, as a
simple calculation shows, $\phi + \bar\partial\xi$ in place of $\phi$, where $\xi$ is a vector
field. On considering all possible families of diffeomorphisms $g(t)$
we can obtain all possible vector fields $\xi$. Thus choosing a vector in
the tangent space to $M$ at 0, $(T_M)_0$, determines uniquely an element
$\phi$ mod $\bar\partial$ (vector field) with $\bar\partial\phi = 0$; this means that it determines
uniquely a harmonic element $\psi$ (for a unique choice of $\xi$ makes
$\phi + \bar\partial\xi$ harmonic). Denote by $\rho_0$ the map $(T_M)_0 \to H\Phi^{0,1}$.
Clearly for every value of $t$ there is a similarly well-defined map $\rho_t$:
$(T_M)_t \to H_t\Phi_t^{0,1}$ (with obvious notation).

The assumptions under which Kodaira and Spencer show that a
given family $V_t$ is complex analytic are, for each $t$:

1. The dimension of $H_t\Phi_t^{0,1}$ is independent of $t$.

2. The map $\rho_t$ is injective.

3. The image $\rho_t((T_M)_t)$ is a complex subspace of $H_t\Phi_t^{0,1}$.

These assumptions are independent of the particular Hermitian
metric introduced on $V_0$.

The proof involves construction of an almost complex structure 
on M (in a fairly obvious way) and then showing that it satisfies 
the integrability conditions. The proof of integrability involves 
further use of elliptic equation theory. A simplified version of the 
proof has been given in [17]. 

7. $\bar\partial$-NEUMANN PROBLEM

In the previous section, we considered the operator $\bar\partial$ acting on
forms on a compact complex analytic manifold and also its adjoint $\vartheta$,
which is defined once a Hermitian metric is chosen. In terms of
the associated elliptic operator $\Box = \bar\partial\vartheta + \vartheta\bar\partial$, having a finite dimensional
null space, we had the Hodge-Kodaira decomposition
theorem. The $\bar\partial$-Neumann problem is concerned with the same
operators acting on forms defined on an open subset $M$ of $V$ with
compact closure, and satisfying certain boundary conditions (like
the Neumann conditions for the classical Laplace equation). We
wish to obtain an analogue of the decomposition theorem. For a
real manifold with the exterior differentiation operator $d$ in place of
$\bar\partial$ the corresponding problem has been treated by several authors
(see [31] for complete references). For that case the boundary
conditions satisfy the coerciveness conditions of Section 5. But
for $\bar\partial$ the boundary conditions are not coercive and this makes the
problem much more difficult and interesting.





The most difficult part of the problem is the proof of regularity at
the boundary of “generalized” solutions. If the boundary conditions
are not coercive it is known that the lower-order terms may
affect the situation (see [21]). We may illustrate this in the case of
one complex variable: consider the boundary value problem for a
complex valued function $w$ of a complex variable $z$ in a disc

(7.1)   $w_{z\bar z} + cw = f$ in $|z| < 1$, $\quad w_{\bar z} = 0$ on boundary

with $c$ a smooth function. (For simplicity we shall suppose that
the solution we consider belongs to $C^2$ in $|z| \le 1$ but all remarks
are easily extended to generalized, or weak, solutions of the problem.)
The boundary condition is not coercive; indeed if $c \equiv 0$, the
homogeneous problem has an infinite dimensional space of solutions,
namely analytic functions. Such solutions need not of course be
regular at the boundary. If however $c \ne 0$ at a boundary point $z_0$,
it is very easy to prove regularity of the solution at the boundary
near $z_0$ (regularity in the interior is a consequence of the ellipticity
of the equation). Dividing the equation by $c$ and differentiating
with respect to $\bar z$, we obtain the following equation for $w_{\bar z}$:

$c^{-1}w_{z\bar z\bar z} + (c^{-1})_{\bar z}w_{z\bar z} + w_{\bar z} = (c^{-1}f)_{\bar z}$, $\quad w_{\bar z} = 0$ on boundary

Regarded as a problem for $w_{\bar z}$ the boundary condition $w_{\bar z} = 0$ is of
the coercive type (in fact it is the Dirichlet condition), and the well-known
local regularity theorems for such problems imply that $w_{\bar z}$ is regular in
some neighborhood of $z_0$. Since $c \ne 0$ near $z_0$ we have, from (7.1),
$w = c^{-1}(f - w_{z\bar z})$, and hence $w$ is also regular there.
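
The passage from (7.1) to a Dirichlet problem can be followed step by step. In this sketch $v = w_{\bar z}$ is a name introduced for illustration, and $c \ne 0$ is assumed near $z_0$.

```latex
% Assumption: c \ne 0 near z_0, and w smooth enough to differentiate (7.1).
\begin{align*}
c^{-1} w_{z\bar z} + w &= c^{-1} f
  && \text{(7.1) divided by } c, \\
c^{-1} w_{z\bar z\bar z} + (c^{-1})_{\bar z}\, w_{z\bar z} + w_{\bar z} &= (c^{-1} f)_{\bar z}
  && \text{differentiated in } \bar z, \\
c^{-1} v_{z\bar z} + (c^{-1})_{\bar z}\, v_z + v &= (c^{-1} f)_{\bar z}, \quad v = 0 \text{ on the boundary}
  && \text{with } v = w_{\bar z} .
\end{align*}
```

The last line is a second-order elliptic equation for $v$ with Dirichlet data, to which the standard boundary regularity theory applies.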

In the $\bar\partial$-Neumann problem it is the shape of the boundary, the
strong pseudoconvexity (in the sense of E. E. Levi), that plays a
role, rather than the lower-order terms.

Strong Pseudoconvexity. In terms of local analytic coordinates
$z = (z^1, \ldots, z^n)$, suppose that the (smooth) boundary through $z_0$
is expressed by $f(z) = 0$ where $f$ is a smooth real function with grad
$f(z_0) \ne 0$, with $f < 0$ in $M$ and $f > 0$ outside $M$. The quadratic
form $f_{z^j\bar z^k}a^j\bar a^k$ is to be positive definite for all $a = (a^1, \ldots, a^n)$
satisfying $f_{z^j}a^j = 0$ (using summation convention). It is easily
verified that this property is independent of the particular local
coordinate or of the function $f$ representing the boundary.
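
A standard example, not discussed in the text, is the unit ball in complex $n$-space. The sketch below takes $M = \{|z| < 1\}$ with defining function $f(z) = |z|^2 - 1$.

```latex
% Illustrative example (not from the text): M the unit ball in C^n.
\begin{align*}
f(z) &= \sum_j z^j \bar z^j - 1, \qquad
f_{z^j} = \bar z^j, \qquad
f_{z^j \bar z^k} = \delta_{jk}, \\
f_{z^j \bar z^k}\, a^j \bar a^k &= \sum_j |a^j|^2 > 0 \quad \text{for } a \ne 0 .
\end{align*}
```

The gradient of $f$ does not vanish on $|z| = 1$, and the form is positive definite on all vectors $a$, in particular on those satisfying $f_{z^j}a^j = 0$, so the boundary is strongly pseudoconvex.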

We now describe the boundary conditions. Observe first that if
$\phi$ and $\psi$ are, respectively, forms of type $(p, q)$ and $(p, q + 1)$ which
are $C^\infty$ in the closure of $M$, then, in terms of the $L^2$ scalar product,
$(\bar\partial\phi, \psi) - (\phi, \vartheta\psi)$ equals a boundary integral which for fixed $\psi$
vanishes for all such $\phi$ if and only if $\nu\psi$ vanishes. At each boundary
point the space of forms admits a unique orthogonal decomposition,
depending on the metric, into the space of forms having $\bar\partial r$ as a
factor and its orthogonal complement; here, for points near the
boundary, $r$ represents the geodetic distance to the boundary
measured as positive outside and negative inside $M$. In this
decomposition we denote the part of the form $\psi$ having $\bar\partial r$ as a
factor by $\nu\psi$.

Boundary Conditions. The $\bar\partial$-Neumann boundary conditions for a
form $\varphi$ are the following.

(7.2)    $\nu\varphi = 0$,    $\nu\bar\partial\varphi = 0$ on the boundary $bM$

We say that $\varphi$ is a harmonic form associated with the $\bar\partial$-Neumann
problem if it satisfies these boundary conditions and is a solution of
$\bar\partial\varphi = 0$, $\vartheta\varphi = 0$. $\mathcal{A}^{p,q}$ will denote the forms of type $(p, q)$ which
are $C^\infty$ in the closure of $M$, $\mathcal{D}^{p,q}$ the subset satisfying (7.2), and
$\mathcal{H}^{p,q}$ the smaller subset satisfying also $\bar\partial\varphi = \vartheta\varphi = 0$.

Kohn [30, 31] has recently obtained an analogue of the Hodge-Kodaira
decomposition theorem for the $\bar\partial$-Neumann problem and
used it to solve several interesting problems. The work is based on
a certain $L^2$ inequality (7.4) which was proved by Morrey [40] for
$(0, 1)$ forms on small tubular (complex) neighborhoods of real
analytic manifolds; Morrey treated the $\bar\partial$-Neumann problem in such
a region in connection with his work on analytic embedding of real
analytic manifolds. The inequality is an analogue of the $L^2$ coercive
inequalities, for forms $\varphi \in \mathcal{A}^{p,q}$, with $q > 0$, in a strongly
pseudoconvex domain $M$, satisfying $\nu\varphi = 0$ on the boundary $bM$. In
describing it we introduce the following (noninvariant) norm

(7.3)    $E(\varphi)^2 = (\varphi, \varphi) + \int_{bM} |\varphi|^2\, ds + \|\varphi\|_{\bar z}^2$

where $ds$ represents the volume element on the boundary, and $\|\varphi\|_{\bar z}^2$ is
defined by means of a fixed covering of the closure of $M$ by a finite
number of holomorphic coordinate patches $\Omega_j$, with local coordinates
$z_j$, as the sum (over $j$) of the integrals in $\Omega_j$ of the sum of squares of
all first derivatives with respect to $\bar z_j^1, \ldots, \bar z_j^n$ of all components
of $\varphi$.

Basic Inequality. For such forms $\varphi$ with $q > 0$ there is a constant
$C$ such that

(7.4)    $E(\varphi)^2 \le C[(\bar\partial\varphi, \bar\partial\varphi) + (\vartheta\varphi, \vartheta\varphi) + (\varphi, \varphi)] = C\,D(\varphi, \varphi)$

We remark that if $q = 0$, in which case $\vartheta\varphi = 0$, the inequality
cannot hold. We can see this immediately if $p$ also vanishes, for
here the inequality would imply that it is possible to estimate $E(\varphi)^2$
in terms of $(\varphi, \varphi) = \|\varphi\|^2$ for any holomorphic function $\varphi$, which
is clearly not possible.

An important property of the norm $E(\varphi)$ is the fact that it is
completely continuous with respect to the $L^2$ norm, in the sense
that if $\{\varphi_j\}$ is a sequence of forms in $\mathcal{A}^{p,q}$ having uniformly bounded
norms $E(\varphi_j)$, a subsequence converges in the $L^2$ norm $\|\ \|$.

Before describing Kohn’s result let us pursue a natural line of
reasoning suggested by the inequality (7.4) leading to a decomposition
theorem in $L^2$. This is to form the completion, $\mathcal{X} = \mathcal{X}^{p,q}$, in
the norm $E(\varphi)$, of forms in $\mathcal{A}^{p,q}$ satisfying $\nu\varphi = 0$ on $bM$, and to
define a “generalized” solution $\varphi$, in $\mathcal{X}$, of the boundary value
problem

(7.5)    $\Box\varphi + a\varphi = \alpha$,    $\nu\varphi = 0$,    $\nu\bar\partial\varphi = 0$ on $bM$

given $\alpha \in C^\infty$ in the closure of $M$, as an element $\varphi \in \mathcal{X}^{p,q}$ satisfying

(7.6)    $(\bar\partial\varphi, \bar\partial\psi) + (\vartheta\varphi, \vartheta\psi) + (a\varphi, \psi) = (\alpha, \psi)$ for all $\psi \in \mathcal{X}^{p,q}$

Observe that the boundary condition $\nu\bar\partial\varphi = 0$ is obtained as a
“natural” boundary condition; if $\varphi$ is regular in $M + bM$ then
(7.5) and (7.6) are equivalent. In particular the space $\tilde{\mathcal{H}}^{p,q}$ of
generalized harmonic solutions of the $\bar\partial$-Neumann problem is defined
as those forms $\varphi$ in $\mathcal{X}^{p,q}$ satisfying

$(\bar\partial\varphi, \bar\partial\psi) + (\vartheta\varphi, \vartheta\psi) = 0$ for all $\psi \in \mathcal{X}^{p,q}$

As a consequence of the basic inequality (7.4) and the complete
continuity of the norm $E(\varphi)$, we see that for $q > 0$ the spaces
$\tilde{\mathcal{H}}^{p,q}$ of generalized harmonic forms, and a fortiori also $\mathcal{H}^{p,q}$, are
finite dimensional. More generally if $q > 0$ the space of solutions of
the homogeneous equation (7.6), $\alpha = 0$, is finite dimensional. The
space $\tilde{\mathcal{H}}^{p,0}$ is in general infinite dimensional. Another consequence
is the fact that, for $q > 0$, the set of $(p, q)$ forms $\alpha \in L^2$ for which
(7.5) is solvable in the generalized sense (7.6) is a closed subspace of
$L^2$, the range of the “generalized” operator $\Box + a$. This is derived in
a straightforward way using the closed graph theorem. Kohn proves
in addition by a rather simple argument that the same is true also
for $q = 0$ despite the lack of complete continuity in this case.

Consider now elements $\psi \in \mathcal{X}^{p,q} \ominus \tilde{\mathcal{H}}^{p,q}$, $q > 0$, that is, elements
of the completion $\mathcal{X}^{p,q}$ orthogonal (in $L^2$) to the generalized harmonic
space $\tilde{\mathcal{H}}^{p,q}$. Then there is a constant $C_1$ such that

(7.4)'    $\|\psi\|^2 \le C_1[(\bar\partial\psi, \bar\partial\psi) + (\vartheta\psi, \vartheta\psi)]$ for all $\psi \in \mathcal{X}^{p,q} \ominus \tilde{\mathcal{H}}^{p,q}$

(this follows easily by an indirect argument using the complete
continuity of $E(\psi)$). We see that if $\varphi$ is a solution of (7.6) with
$a = 0$ then $(\alpha, \psi)$ vanishes for $\psi \in \tilde{\mathcal{H}}^{p,q}$, that is, $\alpha \in L^2 \ominus \tilde{\mathcal{H}}^{p,q}$.
From (7.4)' we see conversely, with the aid of the representation
theorem for linear functionals in Hilbert space, that for any
$\alpha \in L^2 \ominus \tilde{\mathcal{H}}^{p,q}$, $q > 0$, the equation (7.6) is solvable and there is a unique
solution in $\mathcal{X}^{p,q} \ominus \tilde{\mathcal{H}}^{p,q}$. Denote this by $N\alpha$, define $N\alpha$ as zero
for $\alpha \in \tilde{\mathcal{H}}^{p,q}$, and extend by linearity to all $\alpha$ in $L^2$. Thus if $H\alpha$
denotes the projection (in $L^2$) of any $(p, q)$ form $\alpha$ into $\tilde{\mathcal{H}}^{p,q}$, we
obtain the following decomposition, for $q > 0$,

(7.7)    $\alpha = H\alpha + \Box N\alpha$

where $\Box\varphi$ is understood in the generalized sense as the element $\alpha$
such that (7.6) holds.

This is the $L^2$ decomposition theorem for $(p, q)$ forms, $q > 0$; $N$
here is the Neumann function, analogous to the Green's function.

The main part of Kohn's work, and the most difficult, is the proof
that for $q > 0$ the operator $N$ maps $\mathcal{A}^{p,q}$ into $\mathcal{A}^{p,q}$, that is, that the
generalized solutions of (7.6) for $\alpha \in \mathcal{A}^{p,q}$ (i.e., of class $C^\infty$ in the
closure of $M$) are themselves in $\mathcal{A}^{p,q}$, $q > 0$. This is proved by a
local analysis at the boundary using techniques developed by
Hörmander [25] in connection with his treatment of regularity for
the Cauchy problem (as Theorem 2, Section 2, Lecture 1). This
involves proving first the regularity of tangential derivatives. The
proof is too intricate to be described here.*

* Simplified proofs of the regularity at the boundary have been obtained
by C. B. Morrey, and more recently by J. J. Kohn and L. Nirenberg. L.
Hörmander has given a modified approach to the $\bar\partial$-Neumann problem (to
appear) which avoids the question of regularity at the boundary; it uses a
weaker regularity theorem due to P. Lax and R. Phillips [34]. Although the
decomposition theorem he obtains is not as strong as Kohn’s, it is sufficient for
many applications to functions of several complex variables.





Having the differentiability, we easily verify that

(7.8)    $\Box N = N\Box$,    $\bar\partial N = N\bar\partial$,    and $\vartheta N = N\vartheta$

in the sense that $(\vartheta N\alpha, \beta) = (\vartheta\alpha, N\beta)$ for $\alpha, \beta \in \mathcal{A}^{p,q}$. Furthermore,
$N$ is a compact operator of $L^2$ into $L^2$ for $q > 0$.

Observe that

(7.9)    $H\alpha = \alpha - \vartheta N \bar\partial\alpha$ for $\alpha \in \mathcal{A}^{p,0}$

Kohn derives an analogous decomposition theorem, and operator $N$
(which is not compact, however), also in case $q = 0$. Some additional
argument is required.

At this point we would like to indicate how the main inequality
(7.4) is proved. Since the operator $\Box$ is elliptic the estimate of all
first derivatives in any compact subset in terms of $D(\varphi, \varphi)$ follows
from standard elliptic theory. Using a partition of unity we reduce
the problem to that of establishing (7.4) for functions with support
in some coordinate patch near a boundary point. For such a function,
(7.4) is derived by a careful use of Green’s theorem. We shall
carry it out for the very special situation of $(0, 1)$ forms in a strongly
pseudoconvex domain in $C^n$. For such a form $\varphi = \varphi_j\, d\bar z^j$ we have,
using $,j$ and $,\bar j$ to denote $\partial/\partial z^j$ and $\partial/\partial \bar z^j$,

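$\|\bar\partial\varphi\|^2 + \|\vartheta\varphi\|^2 = \tfrac{1}{2} \int_M \sum_{j,k} |\varphi_{j,\bar k} - \varphi_{k,\bar j}|^2 + \int_M \sum_{j,k} \varphi_{j,j}\, \overline{\varphi_{k,k}}$

(7.10)    $= \int_M \sum_{j,k} |\varphi_{j,\bar k}|^2 + \int_M \sum_{j,k} \bigl( \varphi_{j,j}\, \overline{\varphi_{k,k}} - \varphi_{j,\bar k}\, \overline{\varphi_{k,\bar j}} \bigr)$

The first integral on the right of (7.10) is $\|\varphi\|_{\bar z}^2$ whereas the
integrand in the second is a divergence expression and so is equal to a
boundary integral

(7.11)    $\sum_{j,k} \int_{bM} \bigl( n_{\bar k}\, \bar\varphi_k\, \varphi_{j,j} - n_j\, \varphi_{j,\bar k}\, \bar\varphi_k \bigr)\, ds$

where $n_k$, $n_{\bar j}$ are the (complex) components of the normal in the
sense that $n_k$ and $n_{\bar j}$ are proportional (with a positive factor $b$) to
$f_{,k}$, $f_{,\bar j}$, where $f$ is the function representing the boundary locally by
$f = 0$.

We now have to use the boundary condition $\nu\varphi = 0$ which in
this case is simply $\sum n_j \varphi_j = 0$ or $\sum f_{,j}\varphi_j = 0$. On taking the complex
conjugate this means $\sum n_{\bar k}\bar\varphi_k = 0$, so that the first term in (7.11) drops
out. We may wonder how the second term can be estimated in
terms of the square integral of $|\varphi|$ on the boundary, since it involves
first derivatives of $\varphi$. The reason is that it may be expressed by a
tangential derivative of $\nu\varphi$ modulo a quadratic in $\varphi$. In fact on the
boundary we have (using summation convention) $f_{,\bar k}\bar\varphi_k = 0$. This
means that the operator $\bar\varphi_k(\partial/\partial \bar z^k)$ operates tangentially on the
boundary. Hence in particular we find on applying it to $f_{,j}\varphi_j = 0$,

$\bar\varphi_k \dfrac{\partial}{\partial \bar z^k} (f_{,j}\varphi_j) = 0$ on $bM$

or

$f_{,j}\, \varphi_{j,\bar k}\, \bar\varphi_k = -f_{,j\bar k}\, \varphi_j \bar\varphi_k$ on $bM$

Thus we find that the integral in (7.11) equals

$b \int_{bM} f_{,j\bar k}\, \varphi_j \bar\varphi_k\, ds$

and according to strong pseudoconvexity the integrand is a positive
definite quadratic form. This completes the derivation of the
inequality.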

The reason that pseudoconvexity does not play a role in the
analogous problem on real manifolds with $d$ in place of $\bar\partial$ is that
(7.4) holds on any domain $M$ (even without pseudoconvexity) with,
in fact, $\|\varphi\|_{\bar z}^2$ replaced by the integral of the sums of squares of all
first derivatives of the components of $\varphi$, and hence we do not need
the boundary integral analogous to (7.11) to be positive. We note,
however, that if we consider 1-forms $\sum \varphi_j\, dx^j$ in a strongly
pseudoconvex domain in $R^n$ (which is simply equivalent to strict
convexity), the boundary integral is indeed positive, and we find,
assuming $\nu\varphi = 0$ on $bM$,

$\|d\varphi\|^2 + \|\delta\varphi\|^2 = \sum_{j,k} \int_M |\varphi_{j,k}|^2 + b \int_{bM} f_{,jk}\, \varphi_j \varphi_k\, ds$

where the quadratic form in the boundary integral is positive
definite.

Kohn makes a number of applications of the decomposition
theorem. In particular, he gives a new proof, along the lines suggested
by Spencer, of the fact that an almost complex structure
satisfying the integrability conditions corresponds to a complex
structure. Although the proof is perhaps more involved than the
one given in [41], it is interesting in that it works completely in the
framework of linear theory in contrast to the proof in [41].

The idea of the proof is the following. Consider an almost complex
structure given in some neighborhood of the origin in $C^n$ by
forms as in Section 6:

(7.12)    $\varphi^j = dz^j + \omega^j_{\bar k}\, d\bar z^k$,    $j = 1, \ldots, n$

We may choose these so that the $\omega^j_{\bar k}$ vanish at the origin. Using
these, one may again decompose the space of forms into forms of
associated type $(p, q)$, namely forms spanned by exterior products
of $p$ of these and $q$ of the complex conjugate forms, $d\bar z^j + \bar\omega^j_{\bar k}\, dz^k$.
Furthermore, we may decompose $d$ into $\partial + \bar\partial$, using the unique
decomposition $df = a_j\varphi^j + b_j\bar\varphi^j$, by $\partial f = a_j\varphi^j$, $\bar\partial f = b_j\bar\varphi^j$. The
integrability condition now takes the elegant form $\bar\partial^2 = 0$.

Kohn remarks that the foregoing estimates and the decomposition
theorem may be derived for such operators $\bar\partial$.

Now replace the forms $\varphi^j$ above by the forms $\varphi_t^j$ where $\omega^j_{\bar k}(z, \bar z)$
is replaced by $\omega^j_{\bar k}(tz, t\bar z)$, $t \ge 0$. We verify easily that the $\varphi_t^j$ still
satisfy the integrability conditions. Let $\bar\partial_t$ be the corresponding
operator; for $t = 0$ it corresponds to the $\bar\partial$ of the underlying complex
structure in $C^n$. If we take for $M$ the unit ball, it is certainly
strongly pseudoconvex for $\bar\partial_0$ and hence also for $\bar\partial_t$ for small $t$. Let
$N_t$ be the corresponding Neumann operator in the decomposition
theorem and apply the decomposition (7.7) to the function $z^j$.
According to (7.9),

$z_t^j = z^j - \vartheta_t N_t \bar\partial_t z^j$

satisfies $\bar\partial_t z_t^j = 0$.

If for small $t$ the derivatives of the term $\beta_t = \vartheta_t N_t \bar\partial_t z^j$ are small at
the origin then the derivatives of $z_t^j$ are close to those of $z^j$ there,
and hence the $z_t^j$ may be introduced as new coordinates in some
neighborhood of the origin. The functions $\zeta^j = z_t^j(z/t)$ are then
the desired holomorphic coordinates near the origin for the original
almost complex structure given by (7.12).

To estimate the derivatives of $\beta_t$, observe first that

$\|\beta_t\| \le \|\vartheta_t N_t\| \cdot \|(\bar\partial_t - \bar\partial) z^j\|$

where $\|\vartheta_t N_t\|$ denotes the norm of the operator $\vartheta_t N_t$ as a mapping
of the space of square integrable forms into itself. Since the spaces
$\mathcal{H}^{p,q}$, $q > 0$, for $\bar\partial_0$ are null, the same is true for $\bar\partial_t$ for small $t$
(because the dimension of null spaces of elliptic operators depending
on a parameter is upper semicontinuous in the parameter).
Consequently we may show that the operator $N_t$ depends smoothly on $t$
and hence the norms $\|\vartheta_t N_t\|$ are uniformly bounded for small $t$.
Since $\bar\partial_t - \bar\partial$ has small coefficients we see finally that $\|\beta_t\|$ is small.

It follows from estimates for elliptic equations applied to the
equation $\bar\partial_t \beta_t = (\bar\partial_t - \bar\partial) z^j$ that then all derivatives of $\beta_t$ are small
near the origin for small $t$, which is the desired result.


8. INDEX OF ELLIPTIC OPERATORS 

We have observed in Section 5 that the index for an elliptic 
differential operator with coercive boundary conditions is finite. 
Recently there has been considerable interest in calculating the 
index for general problems (see Gelfand [18]). Previously the index 
had been evaluated for a variety of first- and second-order equations 
in the plane—starting with the so-called Riemann-Hilbert problem 
which is essentially the following: to determine a function u in the 
unit disc satisfying 

$\Delta u = f$,    $\nu \cdot \operatorname{grad} u = \varphi$ on the boundary

where $\nu$ is a unit vector defined on the boundary as a function of arc
length. The index of this problem is equal to $2(1 - n)$ where $n$ is
the rotation number of $\nu$, going around the circle counterclockwise.
Problems in the plane are treated in great detail in the book [54] by 
Vekua, which also contains an extensive bibliography. 
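
As an illustrative sketch (not part of the original lecture), the index formula $2(1 - n)$ can be evaluated numerically for a given boundary vector field. The function names below are hypothetical. The rotation number is approximated by accumulating the angle increments of the field at equally spaced points of the unit circle, which assumes the field is continuous, nowhere zero, and sampled finely enough that successive directions differ by less than half a turn.

```python
import math


def rotation_number(direction, samples=2000):
    """Net number of counterclockwise turns made by direction(theta)
    as theta runs once around the unit circle (0 to 2*pi)."""
    vx, vy = direction(0.0)
    previous = math.atan2(vy, vx)
    total = 0.0
    for k in range(1, samples + 1):
        theta = 2.0 * math.pi * k / samples
        vx, vy = direction(theta)
        angle = math.atan2(vy, vx)
        # Reduce the increment to the interval [-pi, pi) so that the jump
        # of atan2 across the negative real axis is not counted as a turn.
        step = (angle - previous + math.pi) % (2.0 * math.pi) - math.pi
        total += step
        previous = angle
    return round(total / (2.0 * math.pi))


def riemann_hilbert_index(direction):
    """Index 2(1 - n) of the boundary problem, n being the rotation number."""
    return 2 * (1 - rotation_number(direction))


def outward_normal(theta):
    # Neumann problem: the field is the outward normal, which turns once.
    return (math.cos(theta), math.sin(theta))


def constant_field(theta):
    # A fixed direction does not rotate at all.
    return (1.0, 0.0)


def double_turn(theta):
    # A field that turns twice while the boundary is traversed once.
    return (math.cos(2.0 * theta), math.sin(2.0 * theta))


if __name__ == "__main__":
    for field in (outward_normal, constant_field, double_turn):
        print(field.__name__, rotation_number(field), riemann_hilbert_index(field))
```

For the outward normal the rotation number is 1 and the formula gives index 0, in agreement with the classical Neumann problem.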

For higher dimensions several authors, Agranovic, Dynin,
Volpert, Seeley, Cordes, Koppelman, and others, have obtained
results in special cases. We refer to [3] and [12] for other references.
Atiyah and Singer [5] have determined a general formula for the
index of an elliptic operator on any compact oriented manifold.
Applying it to various particular differential operators arising
geometrically, they obtain in a unified manner many known
topological results and also new results, for example, the
Hirzebruch-Riemann-Roch theorem for compact manifolds, which had
previously been proved only for projective algebraic manifolds.
The formula for the index is expressed in terms of certain topological
and analytic invariants associated with the manifold and the
operator. It is unfortunately too complicated to be explained here; we
can merely make a few remarks about some of the analytic techniques
that are involved.

Let us first consider some special cases treated by Atiyah and
Singer. Consider a compact Riemannian manifold $V$ of dimension
$n = 2l$; let $\delta$ denote the formal adjoint of the exterior differential
operator $d$. On the space of all complex valued forms the operator
$d + \delta$ is elliptic, but also self-adjoint so its index is zero. But by
considering $d + \delta$ on a subspace it is possible to get index $\ne 0$.

1. If $*$ is the usual isomorphism of $p$ forms to $n - p$ forms
defined with the aid of the metric then $(*)^2 = (-1)^p$. Let $X_1$ ($X_2$)
denote the space of forms with even (odd) degree. Then $d + \delta$
maps $X_1$ into $X_2$ and restricted to $X_1$ it is still elliptic, but no longer
self-adjoint. Applying their formula for the index they obtain the
following, which also follows easily from the Hodge theory and the
Gauss-Bonnet theorem,

$\sum_{p=0}^{2l} (-1)^p \dim H^p = \chi(V)$, the Euler characteristic

Here $H^p$ is the space of harmonic $p$ forms.

2. By decomposing the space of forms in a different way into two
spaces $X_1$, $X_2$ with $d + \delta: X_1 \to X_2$, and calculating the index of
$d + \delta$, they obtain the so-called Hirzebruch index theorem.

3. If $V$ is a complex manifold with Hermitian metric and the
operator $\bar\partial + \vartheta$ is applied to functions with values in a certain
holomorphic vector bundle, they obtain the Hirzebruch-Riemann-Roch
theorem.
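
The alternating sum in example 1 is easy to evaluate once the dimensions of the spaces of harmonic forms are known. The following sketch is an illustration added here, not something taken from Atiyah and Singer; the function name is hypothetical, and the input lists are the standard Betti numbers of a few familiar manifolds, which by Hodge theory equal the dimensions of the spaces of harmonic $p$ forms.

```python
def euler_characteristic(harmonic_dims):
    """Alternating sum of dim H^p for p = 0, 1, ..., n.

    harmonic_dims[p] is the dimension of the space of harmonic p forms,
    i.e. the p-th Betti number of the manifold.
    """
    return sum((-1) ** p * dim for p, dim in enumerate(harmonic_dims))


# Standard Betti numbers of some compact oriented manifolds of even dimension.
SPHERE_S2 = [1, 0, 1]
TORUS_T2 = [1, 2, 1]
GENUS_TWO_SURFACE = [1, 4, 1]
COMPLEX_PROJECTIVE_PLANE = [1, 0, 1, 0, 1]

if __name__ == "__main__":
    print(euler_characteristic(SPHERE_S2))                 # 2
    print(euler_characteristic(TORUS_T2))                  # 0
    print(euler_characteristic(GENUS_TWO_SURFACE))         # -2
    print(euler_characteristic(COMPLEX_PROJECTIVE_PLANE))  # 3
```

The even-degree terms count the null space of $d + \delta$ on $X_1$ and the odd-degree terms count the null space of its adjoint, so the sum is exactly the index of $d + \delta: X_1 \to X_2$.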

Turning now to the analytic techniques we remark that by introducing
suitable normed topologies it is possible to represent the
elliptic operator as a bounded operator. Then since the index of an
operator is unchanged under slight perturbation or by addition of a
compact operator, it follows that the lower-order terms of the operator,
which act as a compact operator in comparison with the
highest-order terms, do not affect the index. Thus the index of the
operator depends only on the highest-order terms. Furthermore, if
the operator is self-adjoint, the index is obviously zero. We mention
also that the index of the product of two operators is the sum
of their indices.
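
These properties can be checked by hand in a finite-dimensional analogue, which is offered here only as an illustration and not as the setting of the lecture. For a linear map from an $n$-dimensional space to an $m$-dimensional space, the index (dimension of the null space minus dimension of the cokernel) always equals $n - m$, so it cannot change under perturbation, and indices add under composition. The helper names below are hypothetical; the rank is computed by exact Gaussian elimination with rational arithmetic.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(entry) for entry in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def index(matrix):
    """dim(null space) - dim(cokernel) of the map x -> matrix * x."""
    m, n = len(matrix), len(matrix[0])
    rk = rank(matrix)
    return (n - rk) - (m - rk)


def matmul(b, a):
    """Matrix product b * a, i.e. the map 'first a, then b'."""
    return [[sum(b[i][t] * a[t][j] for t in range(len(a)))
             for j in range(len(a[0]))] for i in range(len(b))]


A = [[1, 0, 0], [0, 1, 0]]             # maps R^3 onto R^2
A_ZERO = [[0, 0, 0], [0, 0, 0]]        # a very different map R^3 -> R^2
B = [[1, 0], [0, 1], [0, 0], [0, 0]]   # embeds R^2 in R^4

if __name__ == "__main__":
    print(index(A), index(A_ZERO))   # both equal 3 - 2 = 1
    print(index(B))                  # 2 - 4 = -2
    print(index(matmul(B, A)))       # 1 + (-2) = -1
```

In infinite dimensions the index is no longer determined by the spaces alone, which is why its invariance under small and compact perturbations is a substantive statement there.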

The elliptic operators that they treat are for systems of functions,
and since these are to include forms, which cannot be expressed
globally as a system of functions over the manifold, they consider
elliptic operators operating on smooth sections of a complex vector
bundle $E$ over the manifold $V$ and mapping into sections of another
complex vector bundle $F$ over $V$ with the fibers having the same
dimension $N$. Since the index of an elliptic operator is invariant
under deformation of the operator, they seek to deform the operator
to one composed of fairly standard ones. But it seems to be a
difficult problem to determine the connected components of elliptic
operators; instead they consider a more general class of operators
including the elliptic ones as a very special case: singular integral
operators whose symbol (represented locally as an $N \times N$ matrix
defined on the cotangent bundle) does not vanish (i.e., has non-zero
determinant). Singular integral operators have been extended by
Seeley [50] to compact manifolds (for the scalar case, but it is also
easily done in the vector bundle framework), and they have finite
index if the symbol does not vanish (see Section 4). Furthermore,
the index depends only on the symbol. The formula obtained in
[5] is valid, more generally, for such “elliptic” singular integral
operators.

With suitable introduction of an equivalence on the set of all
elliptic symbols (with the property that equivalent symbols have
the same index), the equivalence classes form an abelian semigroup,
and they reduce the calculation of the index to that for a very
special elliptic operator, namely the operator $d + \delta$ acting in the
decomposed spaces of example 2, tensored, however, with a suitable
vector bundle $W$. To calculate the index for this they follow
Hirzebruch, who carried it out for $W = 1$. A basic step is the proof
that if $V$ (of even dimension) is the boundary of a manifold $Y$ with a
vector bundle $U$ on it, which at the boundary reduces to $W$, the
index of the particular operator is zero. To do this they make use
of an extension of the operator $d + \delta$ (with the given decomposition
$X_1 + X_2$) into $Y$, and show that the resulting boundary value
problems, prescribing the $X_1$ component on the boundary, are of the
coercive type mentioned in Section 5, and that the $X_2$ boundary
component of the solution may be expressed by means of a singular
integral operator $T$ in terms of the given $X_1$ boundary component;
and the index of $T$ is then shown to be zero.

We cannot go into any detail here since the proofs use a great 
deal of topology as well as analysis. 





REFERENCES 

1. S. Agmon, “The coerciveness problem for integro-differential forms,” J.
Analyse Math. 6 (1958) 183-223.

2. S. Agmon, A. Douglis, and L. Nirenberg, “Estimates near the boundary
for solutions of elliptic partial differential equations satisfying general
boundary conditions,” I, II, Comm. Pure Appl. Math. 12 (1959) 623-727;
17 (1964) 35-92.

3. M. S. Agranovic and A. D. Dynin, “General boundary value problems for 
elliptic systems in higher dimensional regions,” Dokl. Akad. Nauk. SSSR 
146 (1962) 511-514. Soviet Mathematics 3 (1962), 1323-1327. 

4. L. V. Ahlfors, “Teichmüller spaces,” Proc. Intern. Congr. Math. Stockholm
(1962).

5. M. F. Atiyah and I. M. Singer, “The index of elliptic operators on compact
manifolds” (mimeographed).

6. L. Bers, “Spaces of Riemann surfaces,” Proc. Intern. Congr. Math.
Edinburgh (1958) 14-21.

7. F. Browder, “A priori estimates for solutions of elliptic boundary value
problems,” I, II, Indag. Math. 22 (1960) 145-159, 160-169.

8. A. P. Calderón, “Uniqueness in the Cauchy problem for partial differential
equations,” Amer. J. Math. 80 (1958) 16-36.

9. A. P. Calderón, “Integrales singulares y sus aplicaciones a ecuaciones
diferenciales hiperbolicas,” Univ. Buenos Aires, Fac. Ciencias Exactas y
Nat. Dep. de Matem. (1960).

10. A. P. Calderón and A. Zygmund, “Singular integral operators and
differential equations,” Amer. J. Math. 79 (1957) 901-921.

11. P. Cohen, “The non-uniqueness of the Cauchy problem,” Technical 
Report No. 93, Stanford University (1960). 

12. H. O. Cordes, Lecture notes at University of California, Berkeley 
(1962). 

13. A. Douglis and L. Nirenberg, “Interior estimates for elliptic systems of 
partial differential equations,” Comm. Pure Appl. Math. 8 (1955) 503-538. 

14. D. G. Figueiredo, “The coerciveness problem for forms over vector valued 
functions,” Comm. Pure Appl. Math. 16 (1963). 

15. A. Friedman, Generalized functions and partial differential equations,
Englewood Cliffs, N.J. (1963) Prentice-Hall.

16. K. O. Friedrichs, “Symmetric positive linear differential equations,”
Comm. Pure Appl. Math. 11 (1958) 333-418.

17. A. Frolicher, E. T. Kobayashi, and A. Nijenhuis, “Deformation theory 
of complex manifolds,” Technical Report No. 10, 1959, University of 
Washington. 

18. I. M. Gelfand, “On elliptic equations,” Uspekhi Mat. Nauk 15 (1960) 121; 
Russian Math. Surveys 15 (1960) No. 3, 113-123. 

19. I. M. Gelfand and G. E. Silov, Generalized functions: Vol. 1, Generalized
functions and operations on them; Vol. 2, Spaces of fundamental functions
and generalized functions; Vol. 3, Some questions in the theory of differential
equations, Moscow (1958-1959).





20. L. Hörmander, “On the theory of general partial differential operators,”
Acta Math. 94 (1955) 161-248.

21. L. Hörmander, “On the regularity of the solutions of boundary problems,”
Acta Math. 99 (1958) 225-264.

22. L. Hörmander, “On the uniqueness of the Cauchy problem,” I, II, Math.
Scand. 6 (1958) 213-225; 7 (1959) 177-190.

23. L. Hörmander, “Differential operators of principal type. Differential
equations without solutions,” Math. Ann. 140 (1960) 124-146 and 169-173.

24. L. Hörmander, On the range of differential and convolution operators, Institute
for Advanced Study, Princeton (mimeographed), April 1961.

25. L. Hörmander, Linear partial differential operators, Berlin (1963) Springer.

26. F. John, “Continuous dependence on data for solutions of partial
differential equations with a prescribed bound,” Comm. Pure Appl. Math. 13
(1960) 551-585.

27. K. Kodaira, L. Nirenberg, and D. C. Spencer, “On the existence of
deformations of complex analytic structures,” Ann. of Math. 68 (1958) 450-459.

28. K. Kodaira and D. C. Spencer, “On deformations of complex analytic 
structures,” I, II, Ann. of Math. 67 (1958) 328-466; III, Stability theorems 
for complex structures, Ann. of Math. 71 (1960) 43-76. 

29. K. Kodaira and D. C. Spencer, “Existence of complex structure on a
differentiable family of deformations of compact complex manifolds,” Ann. of
Math. 70 (1959) 145-166.

30. J. J. Kohn, “Solution of the $\bar\partial$-Neumann problem on strongly
pseudo-convex manifolds. Regularity at the boundary of the $\bar\partial$-Neumann
problem,” Proc. Nat. Acad. Sci. 47 (1961) 1198-1202; 49 (1963) 206-213.

31. J. J. Kohn, “Harmonic integrals on strongly pseudo-convex manifolds,” I,
II, Ann. of Math. (to appear).

32. M. Kuranishi, “On the locally complete families of complex analytic 
structures,” Ann. of Math. 75 (1962) 536-577. 

33. P. D. Lax, “The $L_2$ operator calculus of Mihlin, Calderón and Zygmund”
(mimeographed lecture), New York University (1963).

34. P. D. Lax and R. S. Phillips, “Local boundary conditions for dissipative 
symmetric linear differential operators,” Comm. Pure Appl. Math. 13 
(1960) 427-455. 

35. H. Lewy, “An example of a smooth linear partial differential equation 
without solution,” Ann. of Math. (2) 66 (1957) 155-158. 

36. J. L. Lions and E. Magenes, “Problemi ai limiti non omogenei. Problèmes
aux limites non homogènes,” I, III, IV, V, Ann. Scuola Norm. Sup. Pisa 14
(1960) 269-308; 15 (1961) 39-101; 15 (1961) 311-326; 16 (1962) 1-44; II,
Ann. Inst. Fourier 11 (1961) 137-178.

37. B. Malgrange, “Existence et approximation des solutions des équations
aux dérivées partielles et des équations de convolution,” Ann. Inst. Fourier,
Grenoble 6 (1955-1956) 271-355.

38. B. Malgrange, “Sur la propagation de la régularité des solutions des
équations à coefficients constants,” Bull. Math. Soc. Math. Phys., Roumanie,
to appear.

39. S. G. Mihlin, “Singular integral equations,” Uspekhi Mat. Nauk USSR
NS 3 (25) (1948) 29-112.





40. C. B. Morrey, “The analytic embedding of abstract real manifolds,” Ann.
of Math. 68 (1958) 159-201.

41. A. Newlander and L. Nirenberg, “Complex analytic coordinates in almost
complex manifolds,” Ann. of Math. 66 (1957) 391-404.

42. A. Nijenhuis and W. B. Woolf, “Some integration problems in almost-
complex and complex manifolds,” Ann. of Math. 77 (1963) 424-489.

43. L. Nirenberg, A complex Frobenius theorem, Seminars on Analytic Functions, 
Institute for Advanced Study, Princeton (1957) 172-189. 

44. L. Nirenberg, “Functional analysis,” Lecture notes, New York University,
Courant Institute of Mathematical Sciences (1961).

45. L. Nirenberg, “Some aspects of linear and nonlinear partial differential 
equations,” Proc. Intern. Congr. Math., Stockholm (1962). 

46. L. Nirenberg and F. Treves, “Solvability of a first order linear partial 
differential equation,” Comm. Pure Appl. Math., 16 (1963) 331-351. 

47. J. Peetre, “Elliptic partial differential equations of higher order,” Lecture 
notes, University of Maryland, Institute for Fluid Dynamics and Applied 
Mathematics, 1962. 

48. L. Sarason, “On weak and strong solutions of boundary value problems,” 
Comm. Pure Appl. Math. 16 (1962) 237-288. 

49. M. Schechter, “On $L^p$ estimates and regularity I,” Amer. J. Math. 85
(1963) 1-13.

50. R. T. Seeley, “Singular integral operators on compact manifolds,” Amer. J. 
Math. 81 (1959) 658-690. 

51. Séminaire Schwartz, Unicité du problème de Cauchy, University of Paris
(1959-1960).

52. F. Treves, “Relations de domination entre opérateurs différentiels,” Acta
Math. 101 (1959) 1-139.

53. F. Treves, “Lectures on linear partial differential equations with constant 
coefficients,” Lecture Notes No. 27, Inst, de Mat. Pura e Aplicada de 
Conselho Nacional de Pesquisas, Rio de Janeiro (1961). 

54. I. N. Vekua, Generalized analytic functions, Fizmatgiz. Moscow (1959). 
English translation New York (1962) Pergamon Press. 

55. M. Zerner, “Solutions de l'équation des ondes présentant des singularités
sur une droite,” C. R. Acad. Sci. Paris 260 (1960) 2980-2982.


2 

Generators and 
Relations in Groups 
—The Burnside Problem 

Marshall Hall, Jr.


1. INTRODUCTION 

Many of the natural problems on generators and relations in
groups are difficult and some are impossible. Given a group $G$
generated by a finite number of elements subject to a finite number
of relations, the problem of deciding whether or not a particular
word in the generators is the identity is known as the “word problem.”
The word problem is impossible in that it is recursively
unsolvable, as was shown first by Novikov in 1955. From this it
follows that a host of other reasonable sounding problems are also
unsolvable.

The unsolvability of these problems in their most general form
invites us to the consideration of special cases where we may solve
the problems. The most natural way to consider problems on
generators and relations in groups is to consider the groups as
presented as factor groups of free groups. Sections 2 through 6 give a
brief discussion of some of the main results. Section 6 gives a
sketch of a method of Tartakovskii, which yields a solution of the
word problem under suitable restrictions.

The monograph Generators and Relations for Discrete Groups, by
Coxeter and Moser [8], contains a wealth of material and a large
number of particular examples which the reader will find most
rewarding. It has been possible to include only a very little of this
here.

The particular problem which is treated here with some degree of
thoroughness is the Burnside problem. The present formulation
of this problem is the question, “If $G$ is a group generated by a
finite number of elements with the property that there is a positive
integer $n$ such that $x^n = 1$ for every $x$ in $G$, is $G$ necessarily finite?”
In his original paper Burnside [7] formulated this as a question and
not as a conjecture, although some writers speak of “Burnside’s
conjecture.” The groups $G$ are known to be finite if $n = 2, 3, 4,$
or $6$, and Novikov announced in 1959 that the groups are infinite
for $n > 72$. But Novikov’s proof has not yet appeared, and it
may be that since his announcement he has found a gap in his
proof. Sections 6, 7, and 8 deal with the Burnside problem.
Section 6 deals with the elementary and direct methods applicable to
$n = 2, 3, 4$. Section 7 discusses the application of the theory of
$p$-length developed by P. Hall and G. Higman [19]. Section 8
describes the use of commutators and associated Lie rings in attacking
the restricted Burnside problem and in particular Kostrikin’s
solution of the restricted Burnside problem for prime exponents.


2. PRESENTATION OF GROUPS 

Suppose a group $G$ is generated by elements $x$ and $y$ satisfying
relations $x^2 = y^2 = (xy)^3 = 1$. Here $G$ is finite of order 6. The
distinct elements of $G$ are $1$, $x$, $y$, $xy$, $yx$, $xyx$. In this simple case let
us consider the meaning of the statements and the questions they
raise.

1. “$G$ is generated by elements $x$ and $y$.” This means that $x$ and
$y$ are elements of $G$, and that every element of $G$ has at least one
expression as a finite product of the elements $x$, $x^{-1}$, $y$, and $y^{-1}$.

2. “The distinct elements of $G$ are $1$, $x$, $y$, $xy$, $yx$, $xyx$.” This
means that every element of $G$ can be put into one of these six forms
as a consequence of the relations $x^2 = y^2 = (xy)^3 = 1$, and that
there is no self-contradiction in these relations, nor are any further
equalities consequences of these relations.

Since x^2 = 1, then x^{-1}(x^2) = x^{-1} or (x^{-1}x)x = x^{-1} or x = x^{-1}.
Similarly from y^2 = 1 follows y = y^{-1}. Hence a finite product of
x, x^{-1}, y, y^{-1} can be written as a finite product of x and y. In a
finite product g = a_1 a_2 ... a_{i-1} a_i a_{i+1} a_{i+2} ... a_t, where for
j = 1, ..., t either a_j = x or a_j = y, if two consecutive a’s are
equal, say a_i = a_{i+1} = x or a_i = a_{i+1} = y, since x^2 = y^2 = 1, we
have a_i a_{i+1} = 1, and we may rewrite g = a_1 ... a_{i-1} a_{i+2} ... a_t.
Thus g is either the identity or an alternating product of x’s and y’s.
Also from (xy)^3 = 1 we find xyxyxy = 1, yxy = x^{-1}y^{-1}x^{-1} = xyx.
From this we see that every g can be put in one of the forms 1, x, y,
xy, yx, xyx, and so G has at most six elements. But if we consider

    X = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix},
    Y = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix},

we find that X^2 = Y^2 = (XY)^3 = 1 and these permutations generate the
symmetric group on three letters, and in this the six elements 1, X, Y,
XY, YX, XYX are all different. Here mapping x into X and y
into Y we have what we call a realization of the group G. This
realization shows that the relations are not self-contradictory, and
that no further equalities are consequences of the relations.
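
The realization can be checked mechanically. The following is a minimal sketch, not part of the original lecture: permutations of {1, 2, 3} are stored as tuples, a product is read from left to right, and the helper names (compose, evaluate, forms, images) are our own hypothetical choices.

```python
# Permutations of {1, 2, 3} stored as tuples: p[i - 1] is the image of i.
IDENTITY = (1, 2, 3)
X = (1, 3, 2)  # fixes 1, interchanges 2 and 3
Y = (2, 1, 3)  # interchanges 1 and 2, fixes 3


def compose(p, q):
    """Return the product pq, read left to right: apply p first, then q."""
    return tuple(q[p[i] - 1] for i in range(3))


def evaluate(word):
    """Evaluate a word such as 'xyx' under the mapping x -> X, y -> Y."""
    result = IDENTITY
    for letter in word:
        result = compose(result, X if letter == "x" else Y)
    return result


# The six forms 1, x, y, xy, yx, xyx ('' stands for the void word).
forms = ["", "x", "y", "xy", "yx", "xyx"]
images = {w: evaluate(w) for w in forms}

# The defining relations hold in the realization ...
relations_hold = all(evaluate(w) == IDENTITY for w in ("xx", "yy", "xyxyxy"))
# ... and the six forms are mapped to six different permutations.
all_distinct = len(set(images.values())) == 6
```

If both flags are true, the mapping x -> X, y -> Y is a realization in which no two of the six forms coincide, which is the argument of the text.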

We may easily dispose of one natural question which arises here.
When are defining relations for a group G consistent? The answer
is that any set of defining relations is consistent. For if G is
generated by elements x_i and relations f_j(x) = 1, then, if every x_i = 1,
every relation f_j(x) = 1 is satisfied. Here G is the identity and is a
realization of all the relations. The group G consisting of the
identity alone is thus a realization of every group given by
generators and relations, and this we call the trivial realization. If the
only realization of a group G with generators and relations is the
trivial one we say that G is trivial. We have already reached an
unsolvable problem. No effective procedure exists to determine
whether or not a group G given by generators and relations is
trivial. The group G generated by the three elements x, y, z
satisfying relations

(2.1)    x^{-1}yx = y^2,    y^{-1}zy = z^2,    z^{-1}xz = x^2

is trivial, but the reader may find that it is not “trivial” to prove
this.

What may be said about a group G with generators but no
relations? It is conceivable that in every group with say r generators
there is some nontrivial combination which is the identity, perhaps
the same one for all such groups; or perhaps a number of relations exist
such that at least one is satisfied in any group with r generators.
Groups with no relations exist and are called free groups. This
relatively simple state of affairs is not obvious. In a free Lie ring
there are many nontrivial relations [12], and the writer knows of no
satisfactory definition of a free field.

Let I be an index system. For the most part we shall be
interested in the case in which there are only a finite number of indices
and I is the set 1, 2, ..., n. For an infinite set I of indices we
shall assume the indices to be well ordered. We define a set S of
elements s_i, i ∈ I, and their formal inverses s_i^{-1}. A word or
string is either void (written 1) or a finite succession a_1 a_2 ... a_t
where each a_i is one of the s_j^ε, ε = ±1. A word w is a reduced word
if it is void or if w = a_1 a_2 ... a_t and no pair a_i a_{i+1}, i = 1, ...,
t - 1 is of the form s_j^ε s_j^{-ε}, ε = ±1. Here t is called the length of w.
Two words f_1 and f_2 are adjacent if they are of the form
f_1 = g s_j^ε s_j^{-ε} h, f_2 = gh. The relationship of being adjacent is to be
symmetric, that is, if f_1 and f_2 are adjacent, then so are f_2 and f_1.
We now say that two words f and g are equivalent and write f ~ g if
there is a finite sequence f_1 = f, f_2, ..., f_m = g such that f_i and
f_{i+1} are adjacent for i = 1, 2, ..., m - 1. Clearly f ~ g is a true
equivalence relation. All words equivalent to f form a class which
we designate as [f].

For any word f = a_1 a_2 ... a_t we define the “W-process,”
writing

    W_0 = 1, the void word
    W_1 = a_1

and for i = 1, ..., t - 1 recursively,

    W_{i+1} = W_i a_{i+1},  if W_i is not of the reduced form X a_{i+1}^{-1}
            = X,            if W_i is of the reduced form X a_{i+1}^{-1}

We also write W(f) = W_t. It is not difficult to show that W(f) is
a reduced word, and that if f is reduced, then W(f) = f.
Furthermore, if f ~ g, then W(f) = W(g). For details see Chapter 7, [16].
The W-process is mentioned here because it is a simple instance of
an effective procedure for deciding an important question: when are
two words f and g equivalent?
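
The W-process translates directly into a short program. The sketch below is ours and rests on an assumed encoding: a word is a list of pairs (generator, exponent) with exponent +1 or -1, and the names w_process and equivalent are hypothetical. The list "reduced" plays the part of W_i.

```python
def w_process(word):
    """Return W(f) for a word f given as a list of (generator, exponent) pairs.

    Each exponent is +1 or -1. The list `reduced` plays the role of W_i:
    the next letter a_{i+1} either cancels the last letter of W_i or is
    appended to it.
    """
    reduced = []
    for gen, exp in word:
        if reduced and reduced[-1] == (gen, -exp):
            reduced.pop()                 # W_i = X a_{i+1}^{-1}, so W_{i+1} = X
        else:
            reduced.append((gen, exp))    # W_{i+1} = W_i a_{i+1}
    return reduced


def equivalent(f, g):
    """Decide f ~ g by comparing the reduced words W(f) and W(g)."""
    return w_process(f) == w_process(g)


# x y y^{-1} x^{-1} reduces to the void word, written here as [].
example = [("x", 1), ("y", 1), ("y", -1), ("x", -1)]
```

Since every word is equivalent to its reduced form, comparing W(f) with W(g) settles the question of equivalence in a number of steps proportional to the lengths of the words.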

We define a product for classes of words by the rule

(2.2)    [f][g] = [fg]

It is not difficult to show that if f_1 ~ f_2 and g_1 ~ g_2, then
f_1 g_1 ~ f_2 g_2. Having shown this, it follows that the product rule
(2.2) is well defined. Under this rule the classes of words on generators
from the set S = {s_i}, i ∈ I form a group, with the void word 1 as
the identity. If f = a_1 a_2 ... a_t and f* = a_t^{-1} ... a_2^{-1} a_1^{-1},
then [f][f*] = [f*][f] = [1] and [f*] = [f]^{-1}. The group of these
classes is called the free group with generators S, written F(S).
Note that the property of being free is not given as an abstract
property of a group but in terms of a specific set of generators.

Let I be an index system. We say that elements x_i, i ∈ I, of a
group G are generators of G if every element of G can be written as a
finite product of the x’s and their inverses. Trivially, a group is
generated by all its elements. A major theorem is the following.

Theorem 1. Let I be an index system and suppose that the group G
is generated by a set of elements X = {x_i}, i ∈ I. Then if F(S) is the
free group generated by S = {s_i}, i ∈ I, there is a homomorphism
F(S) → G determined by s_i → x_i, i ∈ I.

proof. Let f = a_1 ... a_t be any word on the elements of S.
Consider the element g = b_1 ... b_t ∈ G where b_i = x_j^ε if a_i = s_j^ε.
Then f → g maps every word formed from S onto an element of G.
Clearly adjacent and therefore equivalent words formed from S
are mapped onto the same element of G. Hence the mapping f → g
is in fact a mapping of elements of the free group F(S) onto elements
of G. Moreover if f_1 → g_1, f_2 → g_2, then f_1 f_2 → g_1 g_2. Hence the
mapping s_i → x_i determines a homomorphism of F onto G.

The theorems on homomorphisms yield the following as
essentially a corollary to the preceding theorem.

Theorem 2. Every group G given as generated by a set X is
isomorphic to a factor group of a free group F(S) generated by a set S with
the same cardinal number of elements as X.

Let K be the kernel of the homomorphism F(S) → G of Theorem 1.
Then if f = a_1 a_2 ... a_t ∈ K, in G we have b_1 b_2 ... b_t = 1, and
conversely. Hence the words f(x_i) equal to the identity in G
correspond precisely to the words f(s_i), which are elements of K in F(S).

Definition. A presentation of a group G is the definition of G in
terms of generators X = {x_i}, i ∈ I, and relations f_j(x) = 1, j ∈ J,
where I and J are index systems. G is isomorphic to F(S)/K where
F(S) is the free group generated by S = {s_i}, i ∈ I, and K is the
least normal subgroup of F(S) containing all the words f_j(s), j ∈ J
under the correspondence s_i → x_i, i ∈ I.




We say that G is finitely presented if both the number of
generators and the number of relations are finite. Let us note that the
isomorphism of G with F(S)/K clarifies the question as to which
relations are consequences of the given relations f_j(x). In F(S) let
us take all the words f_j(s) and all their conjugates in F(S), namely
all h^{-1} f_j(s) h where h is arbitrary. Then the set of all finite products
of elements h^{-1} f_j(s) h and their inverses is a normal subgroup K of
F(S). And clearly any normal subgroup of F(S) containing the
f_j(s) must contain all of h^{-1} f_j(s) h, their inverses and finite products
of these, and so K. Thus g(x) = 1 is a consequence of f_j(x) = 1,
j ∈ J in G if and only if in F(S) g(s) is a finite product of the
h^{-1} f_j(s) h and their inverses. In F(S) itself K is the identity and so
g(s) = 1 in F(S) if and only if the reduced form of g(s) is the void word.

3. THE UNSOLVABLE PROBLEMS 

It is not within the scope of this writing to give an adequate
account of the proofs that various problems on generators and
relations in groups are unsolvable. Nevertheless a listing of some of
these problems will be given to show what cannot be done. Thus
we can appreciate the value of those methods which solve problems
under special conditions which are unsolvable in their unrestricted
form.

A rather loose description of unsolvability is the assertion that it
is impossible to write a program for a computer which would solve
the problem. In this context we think of a computer with an
arbitrarily long auxiliary tape which can hold the input data for any
individual instance of the problem.

All problems mentioned here deal with countable systems and 
are questions as to the effective decidability of certain issues. In 
this sense they are special questions on what the logicians call 
recursive enumerability. These problems can be cast as questions 
about the positive integers. A set P of the positive integers is said 
to be recursively enumerable if there are two effective rules given 
which list the integers in P and the integers in the complement of P 
respectively. In this definition a more explicit description is 
needed of what an “effective rule” is, but this requires logical 
technicalities which do not contribute further to the meaning of 
this term. We say that the problem as to whether or not an integer 
is in P is decidable if P is recursively enumerable. In a practical
sense this may not be satisfactory, for there may be no way of
telling how long we must go in the listing of P and its complement 
before we are assured of finding a given integer n in one or the other 
of these lists, since recursive enumerability gives no indication as to 
the order of the lists. 
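
To make the remark concrete, here is a small sketch of how two listings yield a decision, under assumptions of our own: the set P is taken to be the perfect squares (chosen only because both listings are easy to write), and the names squares, non_squares, and decide are hypothetical. The procedure always halts, but the number of steps depends entirely on where n happens to appear in the lists.

```python
from itertools import count
from math import isqrt


def squares():
    """First effective rule: list the members of P (here the perfect squares)."""
    for k in count(1):
        yield k * k


def non_squares():
    """Second effective rule: list the complement of P in the positive integers."""
    for n in count(1):
        if isqrt(n) ** 2 != n:
            yield n


def decide(n, list_p, list_complement):
    """Decide whether the positive integer n is in P.

    The two listings are advanced alternately; n must turn up in exactly
    one of them, so the loop terminates, but there is no advance bound on
    how far we must go when the lists come in no particular order.
    """
    p_items, other_items = list_p(), list_complement()
    while True:
        if next(p_items) == n:
            return True
        if next(other_items) == n:
            return False
```

For instance, decide(49, squares, non_squares) searches both lists in step until 49 appears among the squares.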

The set P of positive integers n which can be written in the form 
n = x s — y 1 with x, y positive integers may not be recursively 
enumerable, since there is certainly no obvious way of listing them 
in increasing order. But if a computable method were known for 
listing the complement, then P would be recursively enumerable. 

The basic problems deal with a finitely presented group G, where
G is generated by a finite number of generators x_1, ..., x_r subject
to a finite number of relations f_j(x_1, ..., x_r) = 1, j = 1, ..., m.
The word problem is the decision as to whether or not a given word
w(x_1, ..., x_r) is the identity in G. The unsolvability of the word
problem was announced by P. S. Novikov [38, 39] in 1952, and his
full proof was given in 1955. Other proofs of the unsolvability of
the word problem for groups have been given by J. L. Britton [5]
and W. W. Boone [3]. Boone has even shown that there is a group
G_0 with two generators and thirty-two defining relations for which
the word problem is unsolvable. The thirty-two relations could
presumably be written out, although they would be enormously
long.

As a consequence of the unsolvability of the word problem, a 
number of other problems can be shown unsolvable. G. Baumslag, 
W. W. Boone, and B. H. Neumann [2] have obtained the following 
results: 

Theorem 3. There is a finitely presented group G_0 such that no
effective procedure exists to determine whether a word given in
generators of G_0 represents:

1. an element of the center of G_0;

2. an element permutable with a given element of G_0;

3. an nth power, where n > 1 is an integer;

4. an element whose class of conjugates is finite;

5. an element of a given subgroup of G_0;

6. a commutator;

7. an element of finite order > 1.

They show that there are many unsolvable problems on finitely 
presented groups, and follow Rabin [41] to prove a very general 
theorem. 




Theorem 4. Let P be an algebraic property of groups (i.e., one that is
shared by all isomorphic copies of any group that has it), and assume
(1) that there is a finitely presented group that has P, and (2) that there
is an integer n such that no free group F_r of rank r > n has P; then
there is a finitely presented group G_P such that no effective procedure
exists to determine whether the elements represented by a finite set of
words in given generators of G_P generate a subgroup with P.

Graham Higman [24] has made a further study of finitely
generated groups and has shown the following:

Theorem 5. A finitely generated group can be embedded in a
finitely presented group if and only if it is recursively presented.

This general result is proved independently of the earlier theorems 
and yields incidentally a new proof of the unsolvability of the word 
problem. 

From Theorem 4 the authors derive the following: 

Theorem 6. There are finitely presented groups G (depending on
the property considered) such that no effective procedure exists to
determine whether the elements represented by a finite set of words generate a
subgroup of G that is (1) trivial, (2) finite, (3) free, (4) locally infinite,
(5) cyclic, (6) Abelian, (7) nilpotent, (8) solvable, (9) simple, (10)
directly indecomposable, (11) freely decomposable, (12) a group with
solvable word problem; and so on ad nauseam (sic!).

These theorems show that along with the word problem, a host of 
other problems are unsolvable for finitely presented groups. Since 
the word problem is solvable for finitely presented Abelian groups, 
we can decide if an element is in the derived group of a finitely
presented group.

This state of affairs raises a major question. 

Problem 1. Are there broad classes of groups for which the word
problem is solvable? Can such groups be easily recognized?

There are certainly classes of groups for which the word problem 
is solvable. Magnus [32] has solved the word problem for any 
group with a single defining relation. Tartakovskii [45, 46] has 
given a solution for the word problem when the defining relations do 
not combine readily two at a time. 





4. FREE GROUPS, FREE PRODUCTS, 

In Section 2 we gave a definition of free groups. If G_i is a set of
groups indexed by letters i ∈ I, where I is a well-ordered index
system, we may define the free product ∏* G_i of the groups G_i in a
manner similar to that used in defining free groups. Consider words
(or strings)

(4.1)    a_1 a_2 ... a_t

which are either void (written 1) or in which each a_i, i = 1, ..., t
is an element of some G_j. For these words we define two kinds of
elementary equivalence (or adjacency):

E1:  a_1 a_2 ... a_{i-1} a_i a_{i+1} ... a_t  ~  a_1 ... a_{i-1} a_{i+1} ... a_t

if a_i is the identity element of some G_j.

E2:  a_1 a_2 ... a_{i-1} a_i a_{i+1} a_{i+2} ... a_t
         ~  a_1 ... a_{i-1} a_i* a_{i+2} ... a_t

if a_i and a_{i+1} are in the same G_j and a_i a_{i+1} = a_i* in G_j. These
elementary equivalences are taken to be symmetric relations. Two
words x and y are said to be equivalent if there is a finite sequence
x_1 = x, x_2, ..., x_n = y with x_i and x_{i+1} elementary equivalent
for i = 1, ..., n - 1. We say that a word w = a_1 a_2 ... a_t is
reduced if it is either void or if (1) no a_i, i = 1, ..., t is the identity
of its group G_j, and (2) a_i, a_{i+1}, i = 1, ..., t - 1, belong to
different groups G_j. We call t the length of w. As in Section 2 we
can define a W-process for words and easily show that in every class
of equivalent words there is exactly one reduced word. If we let
[f] denote the class of words equivalent to a particular f, then the
rule [f][g] = [fg] gives a multiplication for the classes which can be
shown to be well defined. Under this multiplication the classes of
equivalent words form a group in which the void word is the
identity. This group G is the free product of the groups G_i, i ∈ I
and we write

(4.2)    G = ∏*_{i ∈ I} G_i





Free products derive their chief significance from the following
theorem whose proof is essentially the same as that of Theorem 1.

Theorem 7. Let G be a group generated by the elements of a family
of subgroups H_i, i ∈ I where the H_i are isomorphic to groups G_i.
Then G is a homomorphic image of the free product Q = ∏* G_i, i ∈ I.

We note in passing that a free group is the special case of free
product in which each G_i is an infinite cyclic group. But there are
certain differences in their treatment. Thus if F_2 is the free group
generated by elements x and y the element x^{-2} y^{-1} x^2 y has its length
t = 6, but in the free product of {x} and {y} its length is t = 4. We
also note that the existence of free products shows that in a general
group G relations f_j(x) = 1 on a subset x_1, ..., x_r of its generators
and other relations f_k(y) = 1 on a disjoint subset of generators
y_1, ..., y_s are quite independent of each other in a sense which
may be made precise by reference to the free product.
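
The two notions of length for x^{-2} y^{-1} x^2 y can be compared by a short computation. This is a sketch under an assumed encoding of our own (letters as pairs (generator, exponent), exponent +1 or -1); the function names are hypothetical. In the free product of the infinite cyclic groups {x} and {y} a maximal run of letters on one generator is a single element, so the length counts runs rather than letters.

```python
from itertools import groupby


def free_reduce(word):
    """Cancel adjacent pairs s s^{-1} and s^{-1} s until none remain."""
    reduced = []
    for gen, exp in word:
        if reduced and reduced[-1] == (gen, -exp):
            reduced.pop()
        else:
            reduced.append((gen, exp))
    return reduced


def free_group_length(word):
    """Length in the free group: the number of letters of the reduced word."""
    return len(free_reduce(word))


def free_product_length(word):
    """Length in the free product of the infinite cyclic groups on each generator.

    After free reduction, every maximal run of letters on one generator is a
    nonidentity power of that generator, hence one factor of the reduced word.
    """
    reduced = free_reduce(word)
    return sum(1 for _ in groupby(reduced, key=lambda letter: letter[0]))


# The element x^{-2} y^{-1} x^2 y of the text.
element = [("x", -1), ("x", -1), ("y", -1), ("x", 1), ("x", 1), ("y", 1)]
```

Here free_group_length(element) counts the six letters, while free_product_length(element) counts the four factors x^{-2}, y^{-1}, x^2, y.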

Suppose we have a family of groups G_i, i ∈ I, and that various
subgroups of different G’s are isomorphic to each other. What
group G is most freely generated by the G_i with these subgroups
identified? This is the amalgamated product developed initially by
Hannah Neumann [35]. We wish to specify in advance the
intersection of G_i and G_j for i, j ∈ I. This will be a subgroup of G_i and
an isomorphic subgroup of G_j. Hence we must specify not only the
two subgroups but the isomorphism between them. We are given
subgroups U_ij and isomorphisms I_ij

         U_ij ⊆ G_i,    i, j ∈ I
(4.3)    U_ji ⊆ G_j
         I_ij(U_ij) = U_ji

But from the property of intersections we require

(4.4)    G_i ∩ G_j ∩ G_k = (G_i ∩ G_j) ∩ (G_i ∩ G_k)
                         = (G_j ∩ G_i) ∩ (G_j ∩ G_k)
                         = (G_k ∩ G_i) ∩ (G_k ∩ G_j)

For this to be satisfied, the subgroups and isomorphisms of (4.3)
must satisfy

(4.5)    I_jk[I_ij(U_ij ∩ U_ik)] = I_ik(U_ij ∩ U_ik),    i, j, k ∈ I




The relations (4.5) are sufficient to insure that intersections of more
than three G’s will be consistent with the identifications.

If Q = ∏* G_i is the free product of the G’s, let K be the normal
subgroup of Q generated by all elements u_ij u_ji^{-1} and their conjugates
where I_ij(u_ij) = u_ji, i, j ∈ I. Then Q/K is the desired group most
freely generated by the G_i with the identifications desired. But we
have no guarantee that these identifications do not have mysterious
consequences. For example we might find that Q/K is trivial.
Thus if we take the three groups

         G_1 = {x_1, y_1},    x_1^{-1} y_1 x_1 = y_1^2
(4.6)    G_2 = {y_2, z_2},    y_2^{-1} z_2 y_2 = z_2^2
         G_3 = {z_3, x_3},    z_3^{-1} x_3 z_3 = x_3^2

it can be shown that each of the elements x_1, y_1, y_2, z_2, z_3, x_3
generates an infinite cyclic group. Hence the identifications x_1 = x_3,
y_1 = y_2, z_2 = z_3 involve the amalgamation of isomorphic subgroups.
But with these amalgamations we have the group of (2.1)

(4.7)    G = {x, y, z},  x^{-1}yx = y^2,  y^{-1}zy = z^2,  z^{-1}xz = x^2

which is trivial. Let us follow B. H. Neumann [34] to prove this.
From the first relation

(4.8)    y^{-1}xy = xy^{-1}

and so

(4.9)    y^{-i} x y^i = x y^{-i}

Similarly, the second relation gives

(4.10)    z^{-i} y z^i = y z^{-i}

Transforming the third relation by y, we have

(4.11)    y^{-1} z^{-1} x z y = y^{-1} x^2 y

This is

(4.12)    y^{-1}z^{-1}y · y^{-1}xy · y^{-1}zy = y^{-1}x^2 y

or, using (4.8)

(4.13)    z^{-2} x y^{-1} z^2 = (xy^{-1})^2 = x(y^{-1}xy)y^{-2}
                              = x(xy^{-1})y^{-2} = x^2 y^{-3}

Now using (4.10) and the third relation to alter the left hand side
of (4.13), we have

(4.14)    x^4 z^2 y^{-1} = x^2 y^{-3}

From this

(4.15)    z^2 = x^{-2} y^{-2}

Also z^2 = y^{-1}zy gives

(4.16)    zyz^{-1} = yz

and so

(4.17)    z^2 y z^{-2} = zyz^{-1} · z = yz^2

Substituting (4.15) into (4.17) gives

(4.18)    (x^{-2}y^{-2}) y (y^2 x^2) = yz^2

or

(4.19)    x^{-2} y x^2 = yz^2

and so

(4.20)    y^4 = yz^2,    y^3 = z^2

Hence y and z^2 commute and the second relation gives

(4.21)    z^4 = z^2

From this it is immediate that z = 1 and in turn x = 1, y = 1, and
G is trivial.

We shall say that the amalgamated product of the G_i with
identifications U_ij = U_ji exists, provided that there are no unwanted
identifications. Explicitly we require, (1) that in Q = ∏* G_i with
K the kernel, K = {w^{-1}(u_ij u_ji^{-1})w}, G_i ∩ K = 1, all i ∈ I; and
(2) if in Q, g_i ∈ G_i, g_j ∈ G_j and g_i g_j ∈ K, then for an appropriate
u_ij ∈ U_ij and u_ji = I_ij(u_ij) we have g_i = u_ij, g_j = u_ji^{-1}. Under
these circumstances we say the amalgamated product G exists, and
we write

(4.22)    G = ∏* G_i : U_ij = U_ji,    i, j ∈ I





In one simple case the conditions (1) and (2) are always satisfied,
and the amalgamated product exists. This is the case in which all
the U_ij are isomorphic to the same group U. This case was treated
by Schreier [44] in 1927. Let x_{i1} = 1, ..., x_{ij}, ..., be a fixed
set of representatives of the left cosets Ux = Ux_{ij} in G_i. Then any
element g of G = ∏* G_i : U_ij = U can be put into a canonical form

(4.23)    g = u a_1 a_2 ... a_t

with u ∈ U, each a_j ≠ 1 a coset representative and a_j and
a_{j+1}, j = 1, ..., t - 1 from different G’s. Furthermore, these are
“reduced” words in the sense that two words of this form are the
same group element only if they are identical. Note that this
guarantees the existence of an amalgamated product of two groups.

For each i ∈ I let U_i = {U_ij}, j ≠ i. It is clear then that if the
amalgamated product of the G_i exists, the amalgamated product of
the U_i exists. Hannah Neumann [35] has shown conversely that if
the amalgamated product of the U_i exists, also that of the G_i exists.

Problem 2. Find conditions for the existence of an amalgamated
product of three groups.

Problem 3. Find a canonical form for the elements in an amalgamated
product when it exists.

Let G be a group generated by elements s_1, s_2, ..., s_r (or s_i,
i ∈ I if the number is infinite). What can we say about generators
of a subgroup H? Let x_1 = 1, x_2, ..., x_n (allowing n to be
infinite if necessary) be representatives of the left cosets of H in G
thus

(4.24)    G = H + Hx_2 + ... + Hx_n

Furthermore let us define a function Φ(g) for g ∈ G by the rule

(4.25)    Φ(g) = x_i, if g ∈ Hx_i

We note that Φ is a decision function since Φ(g) = 1 if g ∈ H.
Thus by the unsolvability of the word problem for finitely presented
groups there is a normal subgroup H of a finitely generated free
group F for which the function Φ is not computable. Hence in
using (4.24) and (4.25) we are already passing over some
fundamental difficulties. Let φ be the function Φ restricted to arguments
g = x_i s_j^ε, ε = ±1.

(4.26)    φ(x_i s_j^ε) = Φ(x_i s_j^ε)

Theorem 8. If s_i, i ∈ I are the generators of G and x_j, j ∈ J
are representatives of left cosets of a subgroup H of G then
u_ij = x_i s_j φ(x_i s_j)^{-1} are generators of H.

proof. Let us first observe that the function φ satisfies three
simple properties.

          (1) φ(x_i s_j^ε) is an x_k
(4.27)    (2) If x_i s_j^ε is an x_k, then φ(x_i s_j^ε) = x_i s_j^ε
          (3) φ[φ(x_i s_j^ε) s_j^{-ε}] = x_i

The first two properties are immediate. For the third we merely
note that if φ(x_i s_j^ε) = x_k, then x_i s_j^ε = h x_k for some h ∈ H whence
x_k s_j^{-ε} = (h^{-1} x_i s_j^ε) s_j^{-ε} = h^{-1} x_i, and so φ(x_k s_j^{-ε}) = x_i.
Now suppose that

(4.28)    a_1 a_2 ... a_t = h ∈ H,  each a_m some s_j^ε

Then identically

(4.29)    h = [Φ(1) a_1 Φ(a_1)^{-1}][Φ(a_1) a_2 Φ(a_1 a_2)^{-1}] ...
              [Φ(a_1 ... a_{m-1}) a_m Φ(a_1 ... a_m)^{-1}] ...
              [Φ(a_1 ... a_{t-1}) a_t Φ(a_1 ... a_t)^{-1}]

where we note that Φ(1) = 1 and also Φ(a_1 ... a_t) = Φ(h) = 1.
Thus h is a product of factors of the form

    v = Φ(a_1 ... a_{m-1}) a_m Φ(a_1 ... a_m)^{-1}

Let a_1 ... a_{m-1} = h_1 x_i for h_1 ∈ H and x_i a coset representative
and let a_m = s_j^ε, s_j a generator. Then if x_i s_j^ε = h_2 x_k, h_2 ∈ H, x_k a
coset representative, we have Φ(a_1 ... a_{m-1}) = x_i, Φ(a_1 ... a_m) =
x_k = φ(x_i s_j^ε). Also v = x_i s_j^ε x_k^{-1} = h_2 ∈ H. Thus H is
generated by elements of the form v = x_i s_j^ε φ(x_i s_j^ε)^{-1} = x_i s_j^ε x_k^{-1}. But
then v^{-1} = x_k s_j^{-ε} x_i^{-1} = x_k s_j^{-ε} φ(x_k s_j^{-ε})^{-1} using property (3) of the
φ function. Hence each factor in (4.29) or its inverse is of the form
u_ij = x_i s_j φ(x_i s_j)^{-1}, and so these elements generate H.

There are two immediate consequences of this theorem and its
proof: (1) if G is finitely generated and H is of finite index in G, then
H is finitely generated and (2) Φ is computable from φ.
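
Theorem 8 can be tried out in a small concrete group. The following sketch is an illustration of our own, not the author's: G is the symmetric group on three letters given by two permutations, H is the subgroup of order 2 generated by the first of them, products are read left to right, and all names (mul, inv, closure, coset, Phi, schreier_gens) are hypothetical. The coset representatives are chosen arbitrarily except that the identity represents H.

```python
IDENTITY = (0, 1, 2)


def mul(p, q):
    """Product pq read left to right: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


def inv(p):
    """Inverse permutation of p."""
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)


def closure(gens):
    """All elements of the (finite) group generated by gens."""
    elems, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = mul(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems


S = [(0, 2, 1), (1, 0, 2)]      # generators s_1, s_2 of G
G = closure(S)
H = closure([(0, 2, 1)])        # H = {1, s_1}


def coset(g):
    """The coset Hg, as a set of elements."""
    return frozenset(mul(h, g) for h in H)


# One representative x_i per coset Hx_i; the identity represents H itself.
reps = {coset(IDENTITY): IDENTITY}
for g in sorted(G):
    reps.setdefault(coset(g), g)


def Phi(g):
    """The representative of the coset containing g, as in (4.25)."""
    return reps[coset(g)]


# The elements u_ij = x_i s_j Phi(x_i s_j)^{-1} of Theorem 8.
schreier_gens = {
    mul(mul(x, s), inv(Phi(mul(x, s)))) for x in reps.values() for s in S
}
```

Each u_ij lies in H by construction, and Theorem 8 asserts that together they generate H; in this example closure(schreier_gens) can be compared with H directly.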




At this stage there has been no restriction on the choice of coset
representatives except for choosing the identity as the
representative for H. We can expect to gain certain advantages by choosing
the representatives systematically. Let us follow Schreier [44] and
ask that the representatives be what we call, after him, a Schreier
system S.

Definition. A set of words on generators s_1, ..., s_r is a Schreier
system S,

1. if for a_1 ... a_t ∈ S, a_1 ... a_t is reduced, that is, a_i ≠ 1,
i = 1, ..., t, a_i a_{i+1} ≠ 1, i = 1, ..., t - 1,

2. if a_1 a_2 ... a_t ∈ S, then a_1 ... a_{t-1} ∈ S.

3. if a_1 a_2 ... a_t ∈ S, then a_2 ... a_t ∈ S.

If 1 and 2 hold we call S a Schreier system. If 1, 2, and 3 hold we
call S a two-sided Schreier system.

Theorem 9. If H is a subgroup of G, where G is generated by s_i,
i ∈ I, we may choose the representatives of the left cosets of H in G to
be a Schreier system, and if H is normal in G to be a two-sided Schreier
system.

The proof of this is easy and will not be given in detail. If the
s_i, i ∈ I and their inverses are well ordered, we may determine a
well ordering of words a_1 ... a_t in the s_i and their inverses by
ordering words first in terms of their reduced length t and for
a_1 ... a_t and b_1 ... b_t both reduced of length t ordering on the
first a_i ≠ b_i, i = 1, ..., t. In terms of this alphabetical ordering,
if we choose as the representative of a coset the element which as a
word is earliest in the ordering, the theorem is easily proved.

Let us suppose that the representatives x_i of the left cosets of H
in G are a Schreier system. Then if g = a_1 a_2 ... a_t is any word
on the generators of G, let us define its Φ form k(g) by

(4.30)    k(g) = [Φ(1) a_1 Φ(a_1)^{-1}] ...
                 [Φ(a_1 ... a_{m-1}) a_m Φ(a_1 ... a_m)^{-1}] ...
                 [Φ(a_1 ... a_{t-1}) a_t Φ(a_1 ... a_t)^{-1}] · Φ(a_1 ... a_t)

Here trivially k(g) = g as an element of G. Also from property (3)
of the φ function, k(g_1 s_j^ε s_j^{-ε} g_2) ~ k(g_1 g_2), being adjacent words in
the u’s, and so the reduced Φ form of g depends only on the reduced
form of g. We easily prove the following lemma:





Lemma 1. If the representatives x_i form a Schreier system, then
either a v = x_i s_j^ε φ(x_i s_j^ε)^{-1} is in reduced form or its reduced form is
the identity.

Let us delete from the u_ij’s those which are trivial, that is, those
whose reduced forms are the identity, and call U the remaining set
of u_ij’s, and U^{-1} the set of inverses of these u_ij’s.

Lemma 2. Let g = A_1 A_2 ... A_t be a word, possibly void, where
each A_i is in U or U^{-1}. If each A is written in its reduced form
in terms of the s_j’s, then omitting trivial u’s the Φ form of g is
A_1 A_2 ... A_t.

The proof of this is straightforward and will be omitted. It
depends on the observation that the Φ form of an A belonging to U
or U^{-1} involves trivial u’s and itself.

Lemma 3. If A_1 A_2 ... A_t = 1 where each A_i belongs to U or
U^{-1} then A_1 A_2 ... A_t is the Φ form of a word a_1 a_2 ... a_m, each
a_i = s_j^ε, which is the identity, omitting trivial u_ij’s in the Φ form.

This lemma is an immediate consequence of the preceding. Of
course when a word in the u_ij and their inverses is equal to 1,
replacement of the u’s by their given forms in the s_j yields a word
equal to 1. The force of this lemma is the converse. We can now
state our main theorem.

Theorem 10. Let a group G have generators s_a, a ∈ I and defining
relations f_j(s) = 1, j ∈ J. Let H be a subgroup of G and let x_i be a
Schreier system of representatives of left cosets of H in G. Write
u_ij = x_i s_j φ(x_i s_j)^{-1}. The nontrivial u’s are generators of H.
Defining relations for H are the Φ forms of k(x_i f_j(s) x_i^{-1}) = 1, omitting
trivial u’s.

Proof. Everything has already been proved except the
statement about the defining relations for H. From Lemma 3 a word
equal to 1 in the u’s and their inverses is the Φ form (omitting
trivial u’s) of a word in the s_j equal to 1. Such a word is a product
of words g^{-1} f_j(s) g and their inverses, but each of these is separately
the identity. Hence the Φ forms of g^{-1} f_j(s) g are defining relations
for H. There is a redundance here, however. We can write
g^{-1} = h x_i for some h ∈ H and a coset representative x_i. Then
k[g^{-1} f_j(s) g] = k[h x_i f_j(s) x_i^{-1} h^{-1}] = k(h) k[x_i f_j(s) x_i^{-1}] k(h)^{-1} where
k(h^{-1}) = k(h)^{-1} ∈ H, h being an element of H. Thus the Φ forms
k[x_i f_j(s) x_i^{-1}] = 1 are defining relations for H.

Thus several major theorems are now corollaries of what has been 
proved. 

Theorem 11. Every subgroup of a free group is itself a free group. 

This theorem follows from the preceding theorem since the 
defining relations of the subgroup all reduce to the identity. 

Theorem 12. A subgroup of finite index in a finitely presented 
group is itself finitely presented. 

proof. Here the number of x_i’s, s_j’s, and f_a(s) = 1 is finite, and
so the number of u_ij’s and k[x_i f_a(s) x_i^{-1}] = 1 is also finite.

Theorem 13. If G is generated by s_1, ..., s_r and if H is of index
n, the number of nontrivial u_ij is 1 + (r - 1)n. The total length of the
nontrivial u_ij is (2L + n)r - 2L where L is the total length of the x’s.

proof. There are rn u_ij’s including the trivial ones. It is merely
a matter of deciding how many trivial ones there are. From Lemma
1, x_i s_j φ(x_i s_j)^{-1} = 1 if either x_i s_j is in reduced form and is an x_k or if
the reduced form of x_i ends in s_j^{-1}. Hence for fixed j the number of
trivial x_i s_j φ(x_i s_j)^{-1} is the number of coset representatives ending in
s_j or s_j^{-1}. If we take all j = 1, ..., r into account, this accounts
for every coset representative exactly once except for the identity.
Hence there are n - 1 trivial u_ij’s, and the number of nontrivial
u_ij’s is rn - (n - 1) = 1 + (r - 1)n. Similarly, if L is the total
length of the x_i, the total length of all u_ij before reduction is
(2L + n)r and the total length of the nontrivial u_ij is
(2L + n)r - 2L.
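
The counts in Theorem 13 can be checked by machine when G is free. The sketch below is ours and makes several assumptions: H is described by the permutation action of the generators on its n cosets (action[j][c] is the coset reached from coset c on multiplying by s_j, with coset 0 equal to H), letters are pairs (index of generator, exponent), and the representatives are found by a breadth-first search, which yields a Schreier system because every initial segment of a shortest word is again a shortest word. All function names are hypothetical.

```python
from collections import deque


def free_reduce(word):
    """Cancel adjacent inverse pairs in a list of (generator, exponent) letters."""
    reduced = []
    for gen, exp in word:
        if reduced and reduced[-1] == (gen, -exp):
            reduced.pop()
        else:
            reduced.append((gen, exp))
    return reduced


def inverse(word):
    """Formal inverse of a word."""
    return [(gen, -exp) for gen, exp in reversed(word)]


def schreier_generators(action):
    """Return (representatives, nontrivial u_ij) for the subgroup fixing coset 0."""
    r, n = len(action), len(action[0])

    def step(c, j, e):
        # Coset reached from c by s_j (e = +1) or by its inverse (e = -1).
        return action[j][c] if e == 1 else action[j].index(c)

    rep = {0: []}                      # the void word represents H
    queue = deque([0])
    while queue:
        c = queue.popleft()
        for j in range(r):
            for e in (1, -1):
                d = step(c, j, e)
                if d not in rep:
                    rep[d] = rep[c] + [(j, e)]
                    queue.append(d)

    gens = []
    for c in range(n):
        for j in range(r):
            # u_ij = x_i s_j phi(x_i s_j)^{-1}, freely reduced.
            u = free_reduce(rep[c] + [(j, 1)] + inverse(rep[action[j][c]]))
            if u:
                gens.append(u)
    return rep, gens


def check_theorem_13(action):
    """Compare the computed count and total length with the stated formulas."""
    r, n = len(action), len(action[0])
    rep, gens = schreier_generators(action)
    total_rep_length = sum(len(x) for x in rep.values())      # L
    count_ok = len(gens) == 1 + (r - 1) * n
    length_ok = sum(len(u) for u in gens) == (2 * total_rep_length + n) * r - 2 * total_rep_length
    return count_ok and length_ok


# Index 2 in the free group of rank 2: both generators interchange the two cosets.
index_two = [[1, 0], [1, 0]]
# Index 16: cosets are pairs (a, b) mod 4, coded as 4a + b; s_1 raises a, s_2 raises b.
index_sixteen = [
    [4 * ((c // 4 + 1) % 4) + c % 4 for c in range(16)],
    [4 * (c // 4) + (c % 4 + 1) % 4 for c in range(16)],
]
```

With r = 2 the formula 1 + (r - 1)n gives 3 generators for index 2 and 17 for index 16, and check_theorem_13 confirms both the count and the total length for these two actions.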

The foregoing theorems stem from papers by Schreier [44], Hall
and Rado [17], and others [15].

Let F be a free group with r generators x_1, x_2, ..., x_r. We say
that F is of rank r. It is not difficult to verify that y_1 = x_1 x_2,
y_2 = x_2, ..., y_r = x_r are also free generators for F. Thus the
free generators are not uniquely determined. But we can show
that the number of free generators for F is an invariant. Let T be
the subgroup of F generated by all squares of elements. Then as
x^{-1}y^{-1}xy = (x^{-1}y^{-1}x)^2 x^{-2} (xy)^2, we see that T contains all
commutators. Here T is a normal subgroup of F (in fact fully
invariant), and F/T is the elementary Abelian group of order 2^r if r is
finite, and is infinite if r is infinite. If F has infinitely many
generators, the cardinality of F is the same as that of its number of
generators. Hence the number of generators 1 + (r - 1)n given in
Theorem 13 is the best possible value if G is free. But if G is not
free, it often happens that fewer generators will suffice for H and
there may be a redundancy in the defining relations for H as given
by Theorem 10. This situation will certainly arise if one of the
nontrivial u_ij’s is the identity.

Problem 4. Let $G$ be a finitely presented group, and $H$ a subgroup of
finite index. Find, under suitable restrictions if necessary, smaller
values for the number of generators and defining relations for $H$.

As an example suppose that $G$ has generators $x$ and $y$ satisfying
$x^4 = y^4 = 1$, and that $H = G'$. Here $[G:H] = 16$. Our theorems
tell us that $H$ has seventeen generators and thirty-two relations.
But in fact these numbers may be reduced to express $H$ as a group
with nine generators and no relations, that is, $H$ is a free group of
rank 9. Group $G$ is a free product of two cyclic groups of order 4.
A paper by MacLane [31] gives some information on Problem 4 in
this case.

The preceding method of considering subgroups, which we shall 
call the Schreier method, depends on consideration of coset repre¬ 
sentatives. It has already been observed that this method assumes 
a knowledge of which elements are in which coset, although we know 
that in general this is undecidable for finitely presented groups. 

A second approach to the generation of subgroups deals more 
directly with the elements of the subgroup. This we shall call the 
Nielsen method, since Nielsen [36] in 1921 first used this procedure. 
We shall give here a refinement due to the writer [15].

Let $S = \{s_i\}$, $i \in I$ be the generators of a group $G$ where we
assume $I$ to be well ordered. Then we may well order the generators
and their inverses putting $s_i < s_i^{-1} < s_j < s_j^{-1}$ if $i < j$ in $I$.
We introduce an alphabetical ordering $g_1 < g_2$ on elements of $G$ by
the following rules. Write $g_1 = a_1 a_2 \cdots a_t$, each $a_i = s_j^{\epsilon}$,
$\epsilon = \pm 1$, $g_2 = b_1 b_2 \cdots b_u$. Then, (1) $g_1 < g_2$ if the lengths are
unequal, $L_S(g_1) = t < u = L_S(g_2)$; (2) if $L_S(g_1) = L_S(g_2) = t$,
$g_1 = a_1 a_2 \cdots a_t$, $g_2 = b_1 b_2 \cdots b_t$, we put $g_1 < g_2$ if $a_1 = b_1$, $a_2 = b_2$,
$\ldots$, $a_{i-1} = b_{i-1}$ but $a_i < b_i$. As it stands this is strictly an
ordering on words in the generators of $G$. It is a well ordering, and
so we may take as a canonical form for an element $g$ the earliest
word in the ordering to which it is equal. If $G$ is a free group, this
form is merely the reduced form of $g$. If $G$ is a free product of
groups $G_i$, we take as generators of $G$ all elements of the free factors
$G_i$ except the identity. Here again the earliest word representing
an element g is its reduced form. In general the determination of 
the canonical form of an element presents difficulties. Indeed the 
unsolvable problems tell us that this determination is sometimes 
impossible. 

In terms of the alphabetical ordering of words in the generators
$S = \{s_i\}$, $i \in I$ and their inverses we may define a semialphabetical
ordering. For a word $g$ of even length we write $g$ in the form
$g = \alpha\beta^{-1}$ with $L_S(\alpha) = L_S(\beta)$. For a word $g$ of odd length we
write $g$ in the form $g = \alpha a \beta^{-1}$ with $L_S(\alpha) = L_S(\beta)$ and $a = s_j^{\epsilon}$.
Here we order the words by the rules, saying $g_1$ precedes $g_2$ if
(1) $L_S(g_1) < L_S(g_2)$. For words of equal length we order
successively on (2) the alphabetical order of $\alpha$, (3) the alphabetical order
of $\beta$, and (4) the order of the central $a$.

In a free group or a free product the reduced form of an element 
will be the earliest word equal to it and so its canonical form. 
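
The two orderings can be written as sort keys. The Python sketch below is an illustration under assumed conventions that are not in the text: a letter is a nonzero integer ($i$ for $s_i$, $-i$ for $s_i^{-1}$), a word is a tuple of letters, and the function names are hypothetical.

```python
def letter_rank(a):
    # Assumed letter order: s_1 < s_1^{-1} < s_2 < s_2^{-1} < ...
    return (abs(a), a < 0)

def inverse(word):
    return tuple(-a for a in reversed(word))

def alphabetical_key(word):
    # Shorter words come first; equal lengths are compared letter by letter.
    return (len(word), [letter_rank(a) for a in word])

def semialphabetical_key(word):
    # word = alpha * beta^{-1} (even length) or alpha * a * beta^{-1} (odd length)
    t = len(word)
    h = t // 2
    alpha = word[:h]
    beta = inverse(word[t - h:])
    middle = [letter_rank(word[h])] if t % 2 else []
    return (t,
            [letter_rank(a) for a in alpha],
            [letter_rank(a) for a in beta],
            middle)

words = [(1, 2), (1, -2)]          # s_1 s_2 and s_1 s_2^{-1}
by_alphabetical = sorted(words, key=alphabetical_key)
by_semialphabetical = sorted(words, key=semialphabetical_key)
```

The two words are ordered differently by the two keys: alphabetically $s_1 s_2$ comes first, but semialphabetically $s_1 s_2^{-1}$ comes first, because its second half is $\beta^{-1}$ with $\beta = s_2$, and $s_2$ precedes $s_2^{-1}$.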

The Nielsen process for a finite set of elements in a group $G$:
$g_1, \ldots, g_m$, involves the following reductions:

1. Delete a $g_i = 1$.

2. If $g_i^{-1}$ precedes $g_i$, replace $g_i$ by $g_i^{-1}$.

3. If one of $g_j^{\eta} g_i^{\epsilon}$, $\epsilon = \pm 1$, $\eta = \pm 1$, $j \neq i$, precedes $g_i$,
replace $g_i$ by this word.

Here “precedes” refers to the semialphabetical ordering. Since the 
words are well ordered there are only a finite number of successive 
reductions possible. Furthermore, at every stage the elements 
retained generate the same group. Nielsen's original paper showed 
that elements in a free group which could not be reduced were 
themselves generators of a free group. This showed that every 
finitely generated subgroup of a free group was itself free. The 
restriction of this proof to finitely generated subgroups is easily 
removed. Suppose H is a subgroup of G. Our ordering is a well 
ordering of the elements of G. Let us choose a set K of elements 
from $H$ by the rule that $K$ contains every element $h \neq 1$ of $H$ which
is not in the subgroup of $H$ generated by elements of $H$ preceding $h$
in the ordering. It is easy to show that $K$ generates $H$ and that
every finite subset of $K$ is Nielsen reduced, that is, that no one of the
three reductions is possible. Hence if $G$ is free, the original Nielsen
proof may be used to show that $H = F(K)$, the free group generated
by the elements of $K$.
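
The Nielsen process for a finite set of words in a free group can be sketched as follows. This Python fragment is an illustration and not the writer's refinement: it assumes letters are nonzero integers ($i$ for $s_i$, $-i$ for $s_i^{-1}$), and in reduction 3 it tries products in both orders, $g_j^{\eta} g_i^{\epsilon}$ and $g_i^{\epsilon} g_j^{\eta}$, which is an assumption made here for symmetry. All function names are hypothetical.

```python
def reduce_word(word):
    out = []
    for a in word:
        if out and out[-1] == -a:
            out.pop()                 # cancel a letter against its inverse
        else:
            out.append(a)
    return tuple(out)

def inverse(word):
    return tuple(-a for a in reversed(word))

def rank(a):
    return (abs(a), a < 0)            # s_1 < s_1^{-1} < s_2 < ...

def semi_key(word):
    t, h = len(word), len(word) // 2
    alpha, beta = word[:h], inverse(word[t - h:])
    middle = [rank(word[h])] if t % 2 else []
    return (t, [rank(a) for a in alpha], [rank(a) for a in beta], middle)

def nielsen_reduce(generators):
    gs = [reduce_word(g) for g in generators]
    changed = True
    while changed:
        changed = False
        gs = [g for g in gs if g]                     # reduction 1
        for i, g in enumerate(gs):
            candidates = [inverse(g)]                 # reduction 2
            for j, h in enumerate(gs):
                if j == i:
                    continue
                for hh in (h, inverse(h)):            # reduction 3
                    for gg in (g, inverse(g)):
                        candidates.append(reduce_word(hh + gg))
                        candidates.append(reduce_word(gg + hh))
            best = min(candidates + [g], key=semi_key)
            if semi_key(best) < semi_key(g):
                gs[i] = best
                changed = True
                break
    return gs

reduced = nielsen_reduce([(1, 2), (2,)])      # the set {s_1 s_2, s_2}
deduplicated = nielsen_reduce([(1,), (1,)])   # a repeated generator
```

Every replacement puts a strictly earlier word in place of $g_i$, so the loop terminates, and each step replaces one element by a product involving it, so the subgroup generated does not change.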

Now suppose that $G = \prod_{i}^{*} A_i$, $i \in I$ is a free product and that we
consider the elements of the $A_i$ (excluding the identity) as
generators of $G$. As before, let $H$ be a subgroup of $G$; then
$H = \{K\}$, where $K$ is defined by

(4.31) $K = [\,h \in H \mid h \neq 1,\ h \notin \{h'\},\ h' \in H,\ h' < h\,]$

where $h' < h$ means that $h'$ precedes $h$ in the semialphabetical
ordering. Furthermore, every finite subset of $K$ is Nielsen reduced.
For $k \in K$, $k$ is one of the forms

$k = \alpha\beta^{-1}$

(4.32) $k = \alpha a \beta^{-1}$, $\beta \neq \alpha$, $a \in A_j$

$k = \alpha a \alpha^{-1}$, $a \in A_j$

The third form we call a transform. From the definition of $K$ it is
possible to draw certain conclusions as to the way in which products
of $k$'s combine. If $g = a_1 a_2 \cdots a_i a_{i+1} \cdots a_t$ and $a_i$, $a_{i+1}$ belong
to the same free factor $A_j$, we say that $a_i$ and $a_{i+1}$ cancel if $a_i a_{i+1} = 1$
and amalgamate if $a_i a_{i+1} = a_i' \neq 1$.

Lemma 4. In a product $k_1^{\epsilon} k_2^{\eta}$ or $k_2^{\eta} k_1^{\epsilon}$, $k_1, k_2 \in K$, $\epsilon, \eta = \pm 1$,
then, (1) if $k_1 = \alpha\beta^{-1}$, $\beta$ does not cancel and if $\alpha$ cancels, then the
adjacent term of $\beta^{-1}$ does not amalgamate; (2) if $k_1 = \alpha a \beta^{-1}$, $\alpha \neq \beta$,
$\alpha$ and $a$ do not cancel and if $\beta$ cancels then $a$ does not amalgamate;
(3) if $k_1 = \alpha a \alpha^{-1}$, $a \in A_j$ and if $k_2^{\eta} = \alpha a_1 \sigma$ with $a, a_1 \in A_j$, then
$a_1$ is the earliest element in the coset $B_j a_1$ where $B_j = \{a_i\}$, $\alpha a_i \alpha^{-1} \in K$,
$a_i \in A_j$.

The proof of this lemma is straightforward and may be found in 
the writer's paper [13]. It is sufficient to show that k’s of the first 
two forms in (4.32) are free generators of a free group and the gener¬ 
ators of the third form generate conjugates of subgroups of the Aj. 
Thus we are able to describe exactly subgroups of free products. 
The original proof by Kurosch used a complicated double induction. 

Theorem 14. (Kurosch). A subgroup $H \neq 1$ of a free product

$G = \prod_{i}^{*} A_i, \quad i \in I$

is itself a free product

$H = F * \prod_{j}^{*} x_j^{-1} U_j x_j$

where $F$ is a free group and each $x_j^{-1} U_j x_j$ is the conjugate of a
subgroup $U_j$ of one of the free factors $A_i$ of $G$.

MacLane [31] has described the free factors of Theorem 14 more
precisely. For each $i \in I$ the $x_j$'s such that $H \cap x_j^{-1} A_i x_j \neq 1$ are
a subset of appropriately chosen representatives of $H$-$A_i$ double
cosets. If $[G:H]$ is finite the number of free generators of $F$ is

$1 - [G:H] + \sum_{i} ([G:H] - m_i)$

where $m_i$ is the number of $H$-$A_i$ double cosets in $G$.

A third method is a topological one. If G is the fundamental 
group of a complex, a subgroup H of G is the fundamental group of 
a covering of R. Baer and Levi [1] have used this approach to 
prove the Kurosch Theorem 14 on subgroups of free products. 

Of free groups, free products, and amalgamated products, the free 
groups are the simplest and most fundamental. We shall give a 
summary of further results on free groups, some of which may be 
generalized to free or amalgamated products. 

Not every Schreier system may be the coset representatives of a
subgroup of a free group. For example, let $F_2$ be the free group
generated by $x$ and $y$, and $V$ the system $1, x, x^2, \ldots, x^j, \ldots$.
If these are the representatives of cosets of a subgroup $H$, the
element $x^{-1}$ cannot belong to $Hx^j$; for then $x^{j+1}$ is an element of $H$
whereas we have assumed that $Hx^{j+1} \neq H$. The condition that a
Schreier system $V$ be a set of representatives for some subgroup of $F$
is the following [see Schreier condition]: Let $V$ be a Schreier system in
a free group $F$ with generators $s_i$, $i \in I$. Let $M(s_i)$ and $M(s_i^{-1})$ be
the cardinalities of the sets of elements of $V$ not of the form $x s_i$, $x \in V$ or
$x s_i^{-1}$, $x \in V$ respectively. Then $M(s_i) = M(s_i^{-1})$, every $i \in I$ is a
necessary and sufficient condition that $V$ be the coset representatives of
some subgroup $H$ of $F$. This comes rather naturally from the fact
that the mappings $x_j \to \phi(x_j s_i)$, $x_j \to \phi(x_j s_i^{-1})$ where the $x_j$'s are
coset representatives must be one-to-one on $V$. It is easy to show
that in a Schreier system the cardinalities $N(s_i)$ and
$N(s_i^{-1})$ of the sets of elements of $V$ which are of the forms $x_j s_i$ and $x_j s_i^{-1}$ are
the same. Thus the condition $M(s_i) = M(s_i^{-1})$ is automatically
satisfied if the number of elements in $V$ is finite. Furthermore, our
three conditions on the function $\phi(x_i s_j^{\epsilon})$ for $x_i \in V$, $s_j$ a generator
are:

$\phi(x_i s_j^{\epsilon})$ is an $x \in V$

(4.33) If $x_i s_j^{\epsilon}$ is an $x \in V$, then $\phi(x_i s_j^{\epsilon}) = x_i s_j^{\epsilon}$

$\phi[\phi(x_i s_j^{\epsilon}) s_j^{-\epsilon}] = x_i$

The second condition prescribes the value of $\phi$ for certain
arguments. If $M(s_i) = M(s_i^{-1})$, we may choose the rest of the values
of $\phi(x_j s_i)$ to make $x_j \to \phi(x_j s_i)$ a one-to-one mapping of $V$ onto
itself and determine $\phi(x_j s_i^{-1})$ by the inverse mapping. Here all
three conditions of (4.33) are satisfied, and it can be shown [15] that
the elements of $V$ are the coset representatives of the subgroup $H$
generated by the $u_{ij} = x_i s_j \phi(x_i s_j)^{-1}$ and the nontrivial $u_{ij}$'s are free
generators of $H$. From this the following theorem readily follows.

Theorem 15. Let $F$ be a free group on generators $s_i$, $i \in I$ and
$w \neq 1$ any element of $F$ not the identity. Then there is a normal
subgroup $K$ of finite index in $F$ such that $w \notin K$.

proof. From the preceding we may take the word $w$ and its
initial segments as a Schreier system as coset representatives of a
subgroup $H$ of finite index and clearly $w \notin H$. If we represent $F$
as a permutation group on the cosets of $H$, the kernel $K$ of this
representation is the desired normal subgroup.
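
The construction in the proof can be carried out mechanically. The Python sketch below is an illustration with hypothetical names, assuming letters are nonzero integers ($i$ for $s_i$, $-i$ for $s_i^{-1}$) and that $w$ is a nonempty reduced word. The initial segments of $w$ are numbered $0, 1, \ldots, t$; each generator occurring in $w$ gets the partial one-to-one map forced by $w$, which is then completed arbitrarily to a permutation. Generators not occurring in $w$ may be taken to act trivially.

```python
def separating_action(w):
    """Return {generator: permutation of range(len(w) + 1)} such that w
    carries point 0 to point len(w)."""
    n = len(w) + 1
    perms = {}
    for s in sorted({abs(a) for a in w}):
        p = [None] * n
        for k, a in enumerate(w):
            if a == s:
                p[k] = k + 1          # segment k followed by s is segment k + 1
            elif a == -s:
                p[k + 1] = k          # segment k followed by s^{-1} is segment k + 1
        unused = [v for v in range(n) if v not in p]
        for v in range(n):
            if p[v] is None:
                p[v] = unused.pop()   # complete the partial map arbitrarily
        perms[s] = p
    return perms

def act(perms, word, point=0):
    for a in word:
        p = perms[abs(a)]
        point = p[point] if a > 0 else p.index(point)
    return point

w = (1, 2, -1, -2)                    # the commutator of s_1 and s_2
perms = separating_action(w)
end_point = act(perms, w)
```

Since $w$ moves the point 0, it lies outside the stabilizer $H$ of that point and hence outside the kernel $K$ of the permutation representation, which has index at most $(t + 1)!$.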

Corollary . In a free group the intersection of all subgroups of 
finite index is the identity. 

A group in which the intersection of all subgroups of finite index is 
the identity is said to be residually finite. In a residually finite 
group we may introduce a topology [11] calling a subgroup of finite 
index a neighborhood of the identity. 

In a free group the Schreier method and the Nielsen method of 
finding free generators for a subgroup are related. If we take as 
representative of a coset the earliest element in the alphabetical
ordering, the $u_{ij}$'s (or the inverses of some of them) are the generators
yielded by the semialphabetical ordering. This has been shown in
[15]. 

Easy consequences of the Nielsen method are the following 
theorems. 





Theorem 16. If a free group $F$ of rank $r$ is generated by $r$ elements,
it is freely generated by them, and mapping the generators of $F$ onto
these elements determines an automorphism of $F$.

Theorem 17. The automorphisms of a free group $F$ on generators
$s_1, \ldots, s_r$ are generated by the following automorphisms: (1)
permutations of the $s_i$; (2) $s_i \to s_i^{-1}$, $s_j \to s_j$, $j \neq i$; (3) $s_i \to s_i s_j$, some $j \neq i$,
$s_k \to s_k$, $k \neq i$.

The first two types are level in the sense that the length of no 
element is changed by them. It is a curious thing that the length 
of an element in a free group is very important in most of our proofs, 
and yet it is not left invariant by most automorphisms. Indeed we
do not know any good sense in which even on the average length is
an invariant. Let us illustrate the difficulty. Let $F$ be the free
group with generators $x$, $y$. Then in $F$ there are $4 \cdot 3^{n-1}$ elements of
(reduced) length $n \geq 1$. Consider the automorphism $\alpha$ of $F$
determined by $\alpha$: $x \to xy$, $y \to y$. Let $T_n$ be the total length of the
reduced forms $(g)\alpha$ where $g$ ranges over the $4 \cdot 3^{n-1}$ elements of
length $n$. Then it is not difficult to show that for $n \geq 2$,
$T_{n+1} = 3T_n + 14 \cdot 3^{n-1}$, whence $T_n = (14n + 4) \cdot 3^{n-2}$, $n \geq 1$. But the
total of the original lengths $L_n$ had the value $4 \cdot 3^{n-1} n$. Here
$T_n/L_n = (7n + 2)/6n$ approaches the limit $7/6$ as $n$ goes to infinity, and so in this
sense the automorphism is such that on the average the length of
$(g)\alpha$ is $7/6$ times the length of $g$.
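
The totals $T_n$ can be confirmed by direct enumeration for small $n$. The Python sketch below is an illustration with hypothetical names, assuming the encoding $x = 1$, $y = 2$, with negative integers for inverses.

```python
from itertools import product

def reduce_word(word):
    out = []
    for a in word:
        if out and out[-1] == -a:
            out.pop()
        else:
            out.append(a)
    return tuple(out)

def apply_alpha(word):
    # alpha: x -> x y, y -> y; the images of the inverses follow.
    image = {1: (1, 2), -1: (-2, -1), 2: (2,), -2: (-2,)}
    out = []
    for a in word:
        out.extend(image[a])
    return reduce_word(out)

def total_image_length(n):
    total = 0
    for word in product((1, -1, 2, -2), repeat=n):
        if reduce_word(word) == word:       # keep only the reduced words
            total += len(apply_alpha(word))
    return total

totals = {n: total_image_length(n) for n in range(1, 6)}
ratios = {n: totals[n] / (4 * 3 ** (n - 1) * n) for n in totals}
```

The enumeration grows like $4^n$, so it is only practical for small $n$, but it agrees with the closed form $T_n = (14n + 4) \cdot 3^{n-2}$ and shows the ratio $T_n / L_n$ decreasing toward $7/6$.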

Problem 5. Find a useful measure for elements, in a free group,
related to length, which is invariant under automorphisms.

The phrasing of this problem is of course vague since usefulness 
depends on the application desired. For example, suppose $G$ is
given by generators $x_1, \ldots, x_r$ and relations $f_1(x) = \cdots = f_m(x) = 1$,
that is, as the factor group of the free group $F$ with
generators $s_1, \ldots, s_r$ modulo the normal subgroup $K$ generated
by $f_1(s), \ldots, f_m(s)$ and their conjugates. It would be enormously
valuable to have a measure on $f_1(s), \ldots, f_m(s)$ as elements of $F$
which tells us whether or not $G = F/K$ is finite and, if finite, what
the order of $G$ is. This is of course asking for a great deal.
Nevertheless, in at least one very special case this is possible. Coxeter
and Moser [p. 37] have shown that if $G$ is generated by $x$ and $y$
subject to relations

$x^u = y^v = (xy)^w = 1$

then $G$ is infinite if

(4.34) $\frac{1}{u} + \frac{1}{v} + \frac{1}{w} - 1 \leq 0$

whereas the order of $G$ is $g$ if

(4.35) $0 < \frac{1}{u} + \frac{1}{v} + \frac{1}{w} - 1 = \frac{2}{g}$

The proof depends on a geometric argument involving the cover¬ 
ing of a Euclidean, Riemannian, or Lobachevskian plane by 
triangles. 
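
Conditions (4.34) and (4.35) amount to a short computation. The Python sketch below is an illustration with a hypothetical function name, and it assumes $u, v, w \geq 2$.

```python
from fractions import Fraction

def triangle_group_order(u, v, w):
    """Order of the group x^u = y^v = (xy)^w = 1 according to (4.34), (4.35).
    Returns None when the group is infinite. Assumes u, v, w >= 2."""
    excess = Fraction(1, u) + Fraction(1, v) + Fraction(1, w) - 1
    if excess <= 0:
        return None                 # case (4.34)
    return int(2 / excess)          # case (4.35): excess = 2 / g

icosahedral = triangle_group_order(2, 3, 5)
euclidean = triangle_group_order(2, 3, 6)
dihedral = triangle_group_order(2, 2, 7)
```

Exact rational arithmetic is used so that the borderline Euclidean case, where the excess is exactly zero, is classified correctly.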

The most obvious invariant length for an element $g$ of a free
group $F$ is the minimum length of $(g)\alpha$ as $\alpha$ ranges over all
automorphisms of $F$. A procedure which determines this and
considerably more is due to J. H. C. Whitehead [47]. The problem is to
find a finite set $T$ of automorphisms of $F$ such that by successive
applications of automorphisms of $T$ to $g$ we may reduce the length
of the image $(g)\alpha$ until no further reduction is possible. Of course
the critical part of the argument is the proof that if $L[(g)\alpha] \geq L(g)$
for every $\alpha \in T$, this is true for every automorphism $\alpha$. The
original proof by Whitehead was entirely based on topological
arguments. A direct algebraic proof was found by E. S. Rapaport
[42], and a still better proof by P. Higgins and R. Lyndon [21].

Theorem 18. (Whitehead). Let $F$ be the free group generated by
$S = \{s_1, \ldots, s_r\}$. The level automorphisms of $F$ are those given
by mapping $s_i \to s_{j(i)}^{\epsilon_i}$, $\epsilon_i = \pm 1$, $i = 1, \ldots, r$ where $i \to j(i)$ is a
permutation. Let $a$ be any one of the letters $s_1, \ldots, s_r, s_1^{-1}, \ldots,
s_r^{-1}$. A Whitehead automorphism $(A, a)$ of $F$ maps $a$ into $a$ and each
$s_i \neq a, a^{-1}$ onto one of $s_i$, $s_i a$, $a^{-1} s_i$, or $a^{-1} s_i a$. Here $A$ is the set of
the letters consisting of $a$ and those letters $y \neq a, a^{-1}$ which are carried
into either $ya$ or $a^{-1}ya$, whence if $x \to a^{-1}x$, then $x^{-1} \in A$, $x \notin A$.
Let $T$ be the set of level automorphisms and Whitehead automorphisms.
Let $g_1, g_2, \ldots, g_m$ be a finite set of elements of $F$. If the total length
of the $g$'s, $L_S(g_1) + \cdots + L_S(g_m)$, is not reduced by any
automorphism of $T$, the total length of the $g$'s is not reduced by any
automorphism of $F$, and an automorphism which leaves the total length
unchanged is a level automorphism.

This theorem gives an effective algorithm for finding a canonical
form for a finite set of elements $g_1, \ldots, g_m$ under automorphisms
of $F$. We apply an automorphism of $T$ if this reduces the total
length of the $g$'s, and take the new set and apply the automorphisms
of $T$ to this. When no automorphism of $T$ reduces the total length
we are finished, and the result is unique up to a level automorphism.

It is clear that a set of automorphisms sufficient for the validity of 
the theorem must generate all automorphisms, since every nonlevel 
automorphism reduces the length of some element of F . Hence 
presumably the set should contain the generating automorphisms of 
Theorem 17. But these are not enough. If $F$ is generated by
$x$, $y$, $z$, the Whitehead automorphism $\alpha$: $x \to x$, $y \to yx^{-1}$, $z \to xz$
reduces $x^{-1}y^{-1}z^{-1}xyz$ to $y^{-1}z^{-1}yz$ but no one of the automorphisms
of Theorem 17 reduces its length.
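
This example can be verified by machine. The Python sketch below is an illustration with hypothetical names, using the encoding $x = 1$, $y = 2$, $z = 3$ with negative integers for inverses. As an assumption made here, the automorphisms of type (3) are tested in the slightly larger family $s_i \to s_i s_j^{\pm 1}$ and $s_i \to s_j^{\pm 1} s_i$. Types (1) and (2) are level and need no test.

```python
def reduce_word(word):
    out = []
    for a in word:
        if out and out[-1] == -a:
            out.pop()
        else:
            out.append(a)
    return tuple(out)

def inverse(word):
    return tuple(-a for a in reversed(word))

def apply(auto, word):
    # auto maps each generator (positive integer) to its image word.
    out = []
    for a in word:
        out.extend(auto[a] if a > 0 else inverse(auto[-a]))
    return reduce_word(out)

def elementary_maps(rank):
    # s_i -> s_i s_j^e or s_j^e s_i, e = +1 or -1, all other generators fixed
    for i in range(1, rank + 1):
        for j in range(1, rank + 1):
            if i == j:
                continue
            for e in (1, -1):
                for image in ((i, e * j), (e * j, i)):
                    auto = {k: (k,) for k in range(1, rank + 1)}
                    auto[i] = image
                    yield auto

X, Y, Z = 1, 2, 3
g = (-X, -Y, -Z, X, Y, Z)                       # x^-1 y^-1 z^-1 x y z
whitehead = {X: (X,), Y: (Y, -X), Z: (X, Z)}    # x -> x, y -> y x^-1, z -> x z
image = apply(whitehead, g)
shortest_elementary = min(len(apply(a, g)) for a in elementary_maps(3))
```

The Whitehead automorphism shortens the word from six letters to four, while none of the elementary maps tested brings it below six.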


5. TARTAKOVSKII'S LIMITED SOLUTION OF THE WORD PROBLEM

In a series of three papers which appeared in 1949, V. A.
Tartakovskii [45, 46] gave a method for solving the word problem for a
certain class of finitely presented groups. It is to be presumed that 
his goal was the solution of the word problem for all finitely pre¬ 
sented groups. Novikov’s proof in 1955 that this is impossible by 
no means destroys the value of Tartakovskii’s work. Indeed, 
Novikov’s proof may be regarded as a justification for the special 
conditions imposed by Tartakovskii and raises the problem of 
widening these conditions. 

Tartakovskii deals with relations imposed on the free product of 
cyclic groups, some of which may be of finite order, some of infinite 
order. But the sketch of his argument presented here is based on 
the presentation of a group as a factor group of a free group. In 
several respects this simplifies the line of argument, although the 
results are in some instances less general. We must give some 
notation and terminology before we can even formulate the results 
obtained. 

Let $G$ be a group presented as the factor group of the free group $F$
generated by $s_1, \ldots, s_r$, modulo the normal subgroup $K$ of $F$
generated by the finite set $f_1(s), \ldots, f_m(s)$ and their conjugates.
This is equivalent to saying that $G$ is generated by elements
$x_1, \ldots, x_r$ subject to defining relations $f_1(x) = 1, \ldots, f_m(x) = 1$.
Our problem then is to develop an algorithm to decide whether or
not a word $g(x)$ is the identity, or alternately to decide whether or
not $g(s)$ is in $K$, with the canonical homomorphism determined by
$s_i \to x_i$, $i = 1, \ldots, r$.

If $g = g_1 = a_1 a_2 \cdots a_n$ is a reduced word in the free group $F$,
then $a_1^{-1} g a_1 = a_2 \cdots a_n a_1$, $(a_1 a_2)^{-1} g (a_1 a_2) = a_3 \cdots a_n a_1 a_2$, and
similarly $g_i = a_i a_{i+1} \cdots a_n a_1 a_2 \cdots a_{i-1}$ for each $i = 1, 2,
\ldots, n$ is a conjugate of $g$. Thus the conjugacy class of $g$ as an
element of $F$ is the same as that of its cyclical rearrangements
$g_i$. We say that $g = a_1 a_2 \cdots a_n$ is cyclically reduced
if $g$ is reduced and also $a_n \neq a_1^{-1}$, in which case the cyclical
rearrangements are also cyclically reduced. A word $g = a_1 a_2 \cdots a_n$
in which $a_1$ is considered as following $a_n$ is called a cyclic word. If
we wish to make the distinction we write $g = a_1 a_2 \cdots a_n$ for the
ordinary or linear word and $g = |a_1 a_2 \cdots a_n|$ for the cyclic word.
Here $a_i a_{i+1} \cdots a_n a_1 a_2 \cdots a_{i-1}$ is the split of the cyclic word
between $a_{i-1}$ and $a_i$. And so we may think of the letters of a cyclic
word as being on a circle to be read in a clockwise direction, and we
obtain one of its linear forms by splitting between any two
consecutive letters. It is easy to see that in a class of conjugate elements in
the free group $F$ there is a unique cyclically reduced cyclic word, as
determined by any one of its linear splits.
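
The uniqueness of the cyclically reduced cyclic word gives a simple conjugacy test in a free group. The Python sketch below is an illustration with hypothetical names, assuming letters are nonzero integers with negation for inverses.

```python
def reduce_word(word):
    out = []
    for a in word:
        if out and out[-1] == -a:
            out.pop()
        else:
            out.append(a)
    return tuple(out)

def cyclically_reduce(word):
    w = reduce_word(word)
    while len(w) >= 2 and w[0] == -w[-1]:
        w = w[1:-1]                   # strip a first letter against the last
    return w

def rotations(word):
    return [word[i:] + word[:i] for i in range(len(word))]

def same_cyclic_word(u, v):
    return len(u) == len(v) and (u == v or v in rotations(u))

def conjugate_in_free_group(g, h):
    return same_cyclic_word(cyclically_reduce(g), cyclically_reduce(h))

core = cyclically_reduce((1, 2, -1))              # x y x^-1 has core y
rearranged = conjugate_in_free_group((1, 2), (2, 1))
distinct = conjugate_in_free_group((1, 2), (1, -2))
```

Two words are conjugate exactly when their cyclically reduced forms are linear splits of the same cyclic word, which is what `same_cyclic_word` checks by comparing rotations.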

Tartakovskii finds it convenient to use three terms for different
kinds of equality. Let $A = a_1 a_2 \cdots a_t$ and $B = b_1 b_2 \cdots b_n$ be
two words in $F$. We say that $A$ and $B$ are graphically equal (or
equal as written), if $t = n$ and $a_i = b_i$, $i = 1, \ldots, t$ and write
$A \approx B$. If $A$ and $B$ are the same element of $F$ (i.e., their reduced
forms are equal), we say that $A$ and $B$ are freely equal and write
$A = B$. We similarly speak of graphical and free equality for
cyclic words. If $AB^{-1} \in K$, that is, $A(x) = B(x)$ in $G$, we say that
$A$ and $B$ are equal in the group and write $A \equiv B$.

If we split a cyclic word $A$ at a certain point, a cyclic word $B$ at a
certain point, take the product of these linear words, and then form
a cyclic word, we call this operation composition and write the result
as $A * B$. If $A = |a_1 a_2 \cdots a_t|$, $B = |b_1 b_2 \cdots b_n|$, then
$A * B = |a_i a_{i+1} \cdots a_t a_1 \cdots a_{i-1} b_j b_{j+1} \cdots b_n b_1 \cdots b_{j-1}|$
for appropriate $i$ and $j$. We note that $A * B = |x^{-1}Ax \cdot y^{-1}By|$ with
$x = a_1 \cdots a_{i-1}$, $y = b_1 \cdots b_{j-1}$. We can write $A_{(i)} * B_{(j)}$ if
we wish to indicate the points of splitting. It is of course true that
the result of composition depends very much on the points of
splitting, but for the most part we will not indicate these. Any
equality which we write between compositions will mean that the
splitting points must be chosen appropriately. Composition is
commutative:

(5.1) $A * B \approx B * A$

This of course means that with the same splitting points the two 
are graphically the same as cyclic words. We have in addition a 
relation of quasi-associativity , 

(5.2) $A * (B * C) \approx (A * B) * C$ or $(A * C) * B$

Note that $A * (B * C)$ is of one of the four forms (1) $|A\,B\,C|$,
(2) $|B\,A\,C|$, (3) $|B_1\,A\,B_2\,C|$ with $B \approx B_1 B_2$ or (4) $|B\,C_1\,A\,C_2|$
with $C \approx C_1 C_2$. With this observation the validity of (5.2) follows.

Tartakovskii solves the word problem for finitely presented
groups $G$, where $G = F/K$, $F$ the free group generated by $s_1, \ldots, s_r$,
and $K$ the normal subgroup generated by $f_1(s), \ldots, f_m(s)$ and
their conjugates, providing the defining words $f_1(s), \ldots, f_m(s)$ do
not combine too readily. This means that one or the other of two
precise conditions holds. Without loss of generality we may take
$f_1, \ldots, f_m$ to be cyclically reduced cyclic words $A_1, \ldots, A_m$.

If $A = a_1 a_2 \cdots a_t$, then $A^{-1} = a_t^{-1} \cdots a_1^{-1}$ and the
composition $A * A^{-1}$ where we divide at corresponding points is

(5.3) $|a_i a_{i+1} \cdots a_t a_1 \cdots a_{i-1} \cdot a_{i-1}^{-1} \cdots a_1^{-1} a_t^{-1} \cdots a_i^{-1}| = 1$

Of course if $A$ is a power there may be more than one splitting with
$A * A^{-1}$ freely equal to the identity.

Excluding the compositions $A * A^{-1}$ freely equal to the identity,
let $m_{ij}$ be the greatest length of $A_i$ which is canceled in any
composition $A_i * A_j$ or $A_i * A_j^{-1}$. Further let $u_i$ be the length of $A_i$.
Define $\delta$ by

(5.4) $\delta = \max \frac{m_{ij}}{u_i}$, all $i$, $j$

Tartakovskii's First Condition. The word problem is solvable for
$G$ if $\delta < \frac{1}{6}$.
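
For a small presentation, $\delta$ can be found by brute force over all splitting points. The Python sketch below is an illustration and not Tartakovskii's procedure. It assumes letters are nonzero integers with negation for inverses, it takes the length canceled from $A_i$ to be half the loss of length when the composition is cyclically reduced, and it excludes every composition freely equal to the identity. All names are hypothetical.

```python
from fractions import Fraction

def reduce_word(word):
    out = []
    for a in word:
        if out and out[-1] == -a:
            out.pop()
        else:
            out.append(a)
    return tuple(out)

def cyclically_reduce(word):
    w = reduce_word(word)
    while len(w) >= 2 and w[0] == -w[-1]:
        w = w[1:-1]
    return w

def inverse(word):
    return tuple(-a for a in reversed(word))

def rotations(word):
    return [word[i:] + word[:i] for i in range(len(word))]

def max_cancellation(A, B):
    best = 0
    for a in rotations(A):                 # every splitting point of A
        for b in rotations(B):             # every splitting point of B
            r = cyclically_reduce(a + b)
            if not r:
                continue                   # freely equal to the identity: excluded
            best = max(best, (len(A) + len(B) - len(r)) // 2)
    return best

def delta(relators):
    best = Fraction(0)
    for A in relators:
        for B in relators:
            for C in (B, inverse(B)):
                best = max(best, Fraction(max_cancellation(A, C), len(A)))
    return best

torus = delta([(1, 2, -1, -2)])                      # x y x^-1 y^-1
genus_two = delta([(1, 2, -1, -2, 3, 4, -3, -4)])
```

With these conventions the single commutator relator gives $\delta = 1/4$, while the relator of length eight gives $\delta = 1/8$, so only the second falls under a bound of $1/6$.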

Now suppose we can write $A$ in the form

(5.5) $A = A^{(1)} A^{(2)} \cdots A^{(k)}$

where, excluding compositions $A * A^{-1}$ freely equal to 1, there is a
composition $A * A_j$ or $A * A_j^{-1}$ which completely cancels $A^{(u)}$ for
an appropriate $j$ depending on $u$ for each $u = 1, \ldots, k$. The
minimum $k$ for all $A_i$, $i = 1, \ldots, m$, and all such decompositions
we shall call $k$, the index of erosion.

Tartakovskii's Second Condition. The word problem is solvable for
$G$ if $k > 6$.

The first condition is really included in the second. There is
simply a slight difference in the viewpoint. Both conditions are
precise formulations of the loose assertion that the defining relations
$f_j(x) = 1$, $j = 1, \ldots, m$ do not combine too readily two at a time.

The major part of Tartakovskii’s work does not depend on his 
conditions but is applicable to any finitely presented group. 
First we note that in general a few elementary reductions may be 
made. We may change the relations to equivalent ones so that 
each relation word $A_1, \ldots, A_m$ is cyclically reduced. Second, if
some of the generators, say $s_{u+1}, \ldots, s_r$, do not occur in any
relation, then $G$ is the free product of the group on $s_1, \ldots, s_u$ with the
relations and the free group on $s_{u+1}, \ldots, s_r$. Third, if a
ator (or its inverse) occurs only once in a relation, we may use this 
relation to solve for this generator in terms of the rest and then have 
a group with one less generator and one less relation. 

There are a number of theorems on compositions. We assume G 
to be given as $F/K$ where $s_1, \ldots, s_r$ are the free generators and
$f_i(s) = A_i$, $i = 1, \ldots, m$ are cyclically reduced words which
with their conjugates generate $K$, and that every $s_j$ occurs in at least
one $A_i$. Any word which is in $K$ (i.e., is the identity in $G$) is called a
Dyck word.

Theorem 19. Every Dyck word is freely equal to a composition of
the defining words $f_i(s) = A_i$, $i = 1, \ldots, m$ and their inverses.

proof. If $B$ and $C$ are Dyck words so are their inverses, and
their product $BC$ is a composition. Also if $B$ is a Dyck word and
$x$ is a generator, then by assumption there is an $A_i$ which contains
the generator $x$ or $x^{-1}$. Hence $A_i$ (or $A_i^{-1}$) is cyclically equivalent
to a word $xT$. But then the composition $T^{-1}x^{-1}BxT$ is cyclically
equivalent to $x^{-1}BxTT^{-1}$, and this is freely equal to $x^{-1}Bx$. This
is sufficient to prove the theorem. Conversely, it is trivial that any 
word freely equal to a composition of the Ai and their inverses is a 
Dyck word. The foregoing theorem says that we may confine our 
attention to a study of compositions. 




Theorem 20. (Commutativity and Quasi-associativity) $|A * B|$
and $|B * A|$ are graphically equal. $|A * (B * C)|$ is graphically
equal to one of $|(A * B) * C|$ or $|(A * C) * B|$.

These results we have already proved. 

Theorem 21. (Monotone Compositions) Any composition of words
$A_1, A_2, \ldots, A_n$ (not necessarily distinct) is graphically equal to a
monotone composition

$|(((\cdots (A_{i_1} * A_{i_2}) * A_{i_3}) * \cdots ) * A_{i_n})|$

where $i_1, i_2, \ldots, i_n$ are $1, \ldots, n$ in some order and where we may
choose the first factor $A_{i_1}$ arbitrarily.

This theorem follows from the preceding without much difficulty.
To simplify our notation we write $A_{i_1} * A_{i_2} * \cdots * A_{i_n}$ for the
monotone composition of this theorem.

In a composition $A_1 * \cdots * A_n$, a segment of consecutive letters
coming from a factor $A_i$ and bounded by letters from other factors is
called an arc. As we construct $A_1 * \cdots * A_i$, $i = 2, \ldots, n$,
when we insert a new factor the splitting point may be between
two previous arcs, and the number of arcs is increased by one, but if
the splitting point is in the interior of a previous arc we increase the
number of arcs by two. This gives Theorem 22.

Theorem 22. In a graphically free composition of $n$ factors there are
at most $2n - 2$ arcs.

Suppose in a composition $P = A_1 * \cdots * A_n$ two factors
$A_i$, $A_j$ are such that $A_i A_j = 1$ and that in the composition $P$ we
have consecutive segments $\ldots A_i' M A_j' \ldots$, where $M$ is a
segment freely equal to the identity, $A_i' A_j' = 1$, and $A_i$, $A_j$ are
split so that $A_i = A_i'' A_i'$ and $A_j = A_j' A_j''$ are graphically inverse.
We say that the composition $P$ is complex. A composition $P$ is
simple if it is not complex. Note that if $A = aba^{-1}bab$, $B = A^{-1} =
b^{-1}a^{-1}b^{-1}ab^{-1}a^{-1}$, the split $A'' = a^{-1}bab$, $A' = ab$, $B' = b^{-1}a^{-1}$,
$B'' = b^{-1}ab^{-1}a^{-1}$ does not make the composition complex since
$A''A'$ and $B'B''$ are not graphical inverses.

A main feature of Tartakovskii's approach is to reduce his
algorithms to an examination of simple compositions. In reducing
a composition to a reduced cyclic word, there is no apparent way of
estimating the amount of cancellation if this includes cancellation of
factors with parts of their own graphical inverses. But in a simple
composition under appropriate conditions this is possible.

The next feature of Tartakovskii's attack is the theory of
extinction. This is based on the alphabetical ordering of linear words
introduced in Section 4. Tartakovskii orders $s_i^{-1} < s_i$ for a
generator $s_i$, instead of $s_i < s_i^{-1}$, but this is an unimportant
technical convention. The ordering of a word is the ordering of the
reduced word to which it is freely equal.

If $A > B$ and $A$ is equal to $B$ in the group $G$, $A \equiv B$ (i.e.,
$AB^{-1} \in K$ in $F$) we say that $A \equiv B$ extinguishes $A$. The set of all
extinguishable words is the domain of extinction of the group $G$, and
the words which are not extinguishable are the fundamental domain.
If we write

(5.6) $F = K + KX_2 + \cdots + KX_i + \cdots$

where each coset representative Xi is the earliest element in its coset, 
the coset representatives Xi are the fundamental domain of G = F/K 
and all other words are in the domain of extinction. If for every 
word A in F we can find an earlier word B equal to A in G = F/K or 
show that A is in the fundamental domain, we shall have solved the 
word problem, since any word A has only a finite number of pred¬ 
ecessors in the alphabetical ordering. This is what Tartakovskii 
does when one of his conditions holds. 

Let $P$ be a fixed reduced word. The set of all words $xPy$ that
are reduced as written is called a ray, and $P$ is its kernel. If every
word in $xPy$ is extinguishable, this is called a ray of extinction.
Trivially, if $P$ is extinguishable, then $xPy$ is a ray of extinction.

Let $R$ be a linear Dyck word, but not necessarily reduced.
Suppose $R \approx PQ^{-1}$ where $P > Q$, that is, $\bar{P} > \bar{Q}$ where $\bar{P}$, $\bar{Q}$ are
respectively the reduced forms of $P$ and $Q$. Then $x\bar{P}y$ is called a left ray
of extinction of $R$. The set of all left rays of extinction which can
be formed from linear words of the cyclic word $|R|$ is called the left
star of extinction of $R$. The right star of extinction of $R$ is the left star
of extinction of $R^{-1}$ and the star of extinction of $R$ is the sum of
its left and right stars of extinction.

Theorem 23. The domain of extinction of the group G is the sum of 
all the stars of extinction of all completely reduced Dyck words. 

The proof of this theorem is easy and is merely a formalization of 
the main problem; the real difficulties lie ahead. 





Let $B_1, B_2, \ldots, B_t$ be a set of cyclic words not necessarily
distinct nor reduced. Let

$i_1 < i_2 < \cdots < i_k$, $k = 1, \ldots, t$

be an ordered subset of $1, \ldots, t$, and let $[B_{i_1}, B_{i_2}, \ldots, B_{i_k}]$ be a
subset (base) of $B_1, \ldots, B_t$. Then the set of all the compositions
which can be formed from the base will be designated by
$\{B_{i_1}, B_{i_2}, \ldots, B_{i_k}\}$. The sum of the stars of extinction of all compositions
of the bases will be called the zone of extinction of the base
$B_1, \ldots, B_t$ and written $Z[B_1, \ldots, B_t]$.

The next theorem is a major one and requires a long sequence of 
preliminary lemmas of some difficulty. This is the longest and 
hardest part of the work. 

Theorem 24 . The star of extinction of a complex composition is 
contained in the sum of the stars of extinction of simple compositions of 
fewer factors. 

If $P = A_1 * \cdots * A_n$ is complex with respect to $A_i$ and $A_j$,
it is shown that the star of extinction of $P$ is contained in the zone of
extinction $Z[A_1, \ldots, A_{i-1}, A_{i+1}, \ldots, A_{j-1}, A_{j+1}, \ldots, A_n]$.

The next part of Tartakovskii's approach is a systematic way of
finding the reduced form of a composition $P = A_1 * \cdots * A_n$.
Suppose there are $N$ arcs in $P$. We know from Theorem 22 that
$N \leq 2n - 2$. Starting with the first arc of $A_1$, let the arcs be called
$C_1, C_2, \ldots, C_N$, where of course as we move clockwise around the
cyclic word, $C_1$ follows upon $C_N$. Thus $|P| \approx |C_1 C_2 C_3 \cdots C_N|$.
Our first cancellation is to cancel as much of $C_1$ as can be canceled
by $C_2$, and so if $C_1 = C_{11} C_{12}$, $C_2 = C_{12}^{-1} C_{22}$, $P = |C_{11} C_{22} C_3 \cdots C_N|$.
If $C_{11}$ is the identity, we say that $C_1$ has been annihilated. If
$C_{12}$ is the identity, there is no cancellation. Next we cancel
as much of $C_{22}$ as we can with $C_3$: $C_{22} = C_{221} C_{222}$, $C_3 = C_{222}^{-1} C_{32}$,
$P = |C_{11} C_{221} C_{32} C_4 \cdots C_N|$. In this manner we move clockwise
around the circle as many times as necessary, canceling on the left
as much as is possible from an arc which has not been annihilated.
On the first circuit of the circle there are at most $N$ of these
cancellation steps, called reductions, since there are $N$ arcs. After the
first circuit no further cancellation is possible on the left between
two arcs which were adjacent on a previous circuit; hence further
cancellation is possible only when an arc has been annihilated.
Since at worst all arcs are annihilated there are at most $N$ reductions
possible beyond the first circuit. Let $E$ be the total number of
reductions. Thus we have shown the following theorem.

Theorem 25. If $E$ is the total number of reductions required to put
the composition $P = A_1 * \cdots * A_n$ into reduced form, and $P$ has
$N$ arcs, then $E \le 2N \le 4n - 4$.
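
As a rough illustration of what "putting a composition into reduced form" means, the following Python sketch freely and cyclically reduces a cyclic word. It is a simplified stand-in and does not reproduce Tartakovskii's arc-by-arc bookkeeping or count reductions. The encoding (a lowercase letter for a generator, the matching uppercase letter for its inverse) and the names `free_reduce`, `cyclic_reduce`, and `reduce_composition` are assumptions made for this example only.

```python
def free_reduce(word):
    """Cancel adjacent inverse pairs such as 'aA' or 'Bb' (uppercase = inverse)."""
    out = []
    for ch in word:
        if out and out[-1] == ch.swapcase():
            out.pop()            # ch cancels against the previous letter
        else:
            out.append(ch)
    return "".join(out)


def cyclic_reduce(word):
    """Reduce a cyclic word: free reduction, then cancel across the wrap-around."""
    w = free_reduce(word)
    while len(w) >= 2 and w[0] == w[-1].swapcase():
        w = w[1:-1]              # first and last letters are adjacent on the circle
    return w


def reduce_composition(arcs):
    """Reduced cyclic form of the arcs C_1, ..., C_N read clockwise."""
    return cyclic_reduce("".join(arcs))
```

For instance, `reduce_composition(["ab", "Bc", "CA"])` cancels completely, while the commutator word `"abAB"` is already cyclically reduced.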

In his third paper Tartakovskii is able to improve on this theorem 
in the following way. 

Theorem 26. If $P = A_1 * \cdots * A_n$, there is a composition of
$A_1, \ldots, A_n$ freely equal to $P$ for which the total number of
reductions $E$ satisfies $E \le 3n - 2$.

We have come this far without using the conditions at all. But
now the application of the conditions is easy. We will use only the
second condition, the first being a consequence of the second. Let
$P = B_1 * \cdots * B_n$ be a simple composition of the defining
words $f_i(s) = A_i$, $i = 1, \ldots, m$, and their inverses satisfying
Theorem 26. Let us list $B_1, B_2, \ldots, B_n$ as cyclic words and at
each reduction of $P$ let us delete from the two corresponding $B$'s the
letters which have been removed. We keep a record of the total
number of reductions for each $B$, say $b_i$ reductions applicable to
$B_i$. Then at the end of all reductions we have

(5.7) $b_1 + b_2 + \cdots + b_n = 2E \le 6n - 4$

Since the composition was simple, at no stage of reduction were we
canceling a segment of a $B$ equal to $A_i$ with the corresponding
segment of a $B$ equal to $A_i^{-1}$. By the second condition, at least
seven such reductions are required to annihilate all of a $B$. Now for
each $i = 1, \ldots, n$ let $c_i = 0$ if $b_i \ge 7$ and $c_i = 7 - b_i$
if $0 \le b_i < 7$. Then from (5.7) it readily follows that

(5.8) $c_1 + c_2 + \cdots + c_n \ge 7n - (6n - 4) = n + 4$

Since the $c$'s are nonnegative and at most equal to 7, it follows that
there are at least $(n + 4)/7$ positive $c$'s in (5.8). But each positive
$c$ corresponds, by the second condition, to a $B$ which has not been
completely canceled in the reduction and so its remaining length is
at least 1. Hence $D$, the reduced form of $P$, has
$Ls(D) \ge (n + 4)/7$. Usually we will have $Ls(D) \ge n + 4$, for if
$c_i$ is positive and if there is a decomposition (5.5) which includes
the segments already reduced in $B_i$, the length of what remains must
be at least $c_i$. If this were valid for every $i$, we would have
$Ls(D) \ge n + 4$. It might happen, however, that the only composition
of $B = A_i$ which cancels a particular letter is that which corresponds
to cancellation with its own inverse. This situation arises, for example,
if $A_1 = abcabc$, $A_2 = abab$ and applied to a letter $c$ in $A_1$.
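
The counting step from (5.7) to (5.8) can be checked numerically. The sketch below is only a sanity check under stated assumptions: it draws random reduction counts $b_i$ with total at most $6n - 4$, forms $c_i = \max(0, 7 - b_i)$, and confirms that the $c_i$ sum to at least $n + 4$ and that at least $(n + 4)/7$ of them are positive. The function names and the random sampling scheme are illustrative and not part of the original argument.

```python
import random


def positive_c_count(b):
    """Return (sum of c_i, number of positive c_i) for reduction counts b_i."""
    c = [max(0, 7 - bi) for bi in b]
    return sum(c), sum(1 for ci in c if ci > 0)


def check(n, trials=1000, seed=0):
    """Randomly test the inequalities derived from (5.7) for a given n."""
    rng = random.Random(seed)
    for _ in range(trials):
        # distribute at most 6n - 4 reductions over the n words at random
        total = rng.randint(0, 6 * n - 4)
        b = [0] * n
        for _ in range(total):
            b[rng.randrange(n)] += 1
        c_sum, positive = positive_c_count(b)
        assert c_sum >= n + 4            # inequality (5.8)
        assert 7 * positive >= n + 4     # at least (n + 4)/7 positive c's
    return True
```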

Hence we obtain all simple reduced Dyck words of length at most
$M$ by finding all simple compositions of at most $7M - 4$ of the
defining $f_i(s) = A_i$. If $g(s)$ is of length $n$, then $g(s)$ is in
the fundamental domain of $G$ unless it is in the star of extinction of
some simple reduced Dyck word of length at most $2n$. Thus a listing of
the simple reduced Dyck words of length at most $2n$ enables us in a
finite number of steps to decide whether or not $g(s)$ is equal to the
identity. Thus with Tartakovskii's conditions the word problem
is solved.

Problem 6. Find other conditions under which the Tartakovskii
procedures yield a solution of the word problem.

6. THE BURNSIDE PROBLEM 

In 1902 Burnside [7] wrote, “A still undecided point in the theory 
of discontinuous groups is whether the order of a group may be not 
finite while the order of every operation it contains is finite.” 
Burnside was of course discussing finitely generated groups, and his 
question can be expressed, “is a finitely generated periodic group 
necessarily finite?” The question in this form has not really been 
touched. If we assume the orders of the elements to be bounded 
and so all divisors of some integer n, the question takes the form 
which we now call the Burnside problem. 

The Burnside Problem. If a group G is finitely generated and if 
every element in G has finite order dividing an integer n, is G finite? 

If we let $F_r$ be the free group with $r$ generators $s_1, \ldots, s_r$
and $N$ be the normal subgroup of $F_r$ generated by all elements $x^n$,
$x \in F_r$, then $B(n, r) = F_r/N$ is clearly a group generated by $r$
elements in which $z^n = 1$ for every $z \in B(n, r)$ and so has the
properties assumed in the problem. Clearly every group $G$ generated by
$r$ elements in which $z^n = 1$ for every $z \in G$ is a homomorphic
image of $B(n, r)$ and so our problem may be put as "Are the groups
$B(n, r)$ finite?" It is known that the groups $B(n, r)$ are finite when
$n = 2$, 3, 4, 6 [16, p. 320], and Novikov [40] has recently announced
that the groups $B(n, r)$ are infinite if $n \ge 72$. But Novikov's full
proof has not as yet been published, and it is rumored that he may have
to increase the number 72 to some larger number. Later in this
section we shall discuss his announcement, which depends on an
analysis following the lines of Tartakovskii's work given in Section 5.
But the groups $B(n, r)$, although finitely generated, are not finitely
related, and more seriously, the Tartakovskii conditions fail to hold.

A weaker form of the Burnside problem is known as the restricted
Burnside problem: For given integers $n$ and $r$ is there a number
$b_{n,r}$ such that every finite group with $r$ generators of exponent
$n$ has order at most $b_{n,r}$?

Note that an affirmative answer to the Burnside problem is
automatically an affirmative answer to the restricted problem. We also
note that if $F_r$ is the free group with $r$ generators and if $N_1$ and
$N_2$ are normal subgroups of finite index in $F_r$ such that each of
$F_r/N_1$ and $F_r/N_2$ is of exponent $n$, then this is also true for
$F_r/N$ with $N = N_1 \cap N_2$, since $N$ is of finite index and
contains all $n$th powers. Hence if the answer to the restricted Burnside
problem is affirmative, there is a single largest finite group $F_r/N$
with $r$ generators of exponent $n$ such that every other such group is a
homomorphic image of it.

It has been shown by Kostrikin [27, 28] that the answer to the 
restricted Burnside problem is affirmative when n is a prime. This 
we shall discuss in a later section. We note that for values of $n$ for
which both Kostrikin's result and Novikov's are valid there is a
largest finite group, but the Burnside group $B(n, r)$ is infinite. This
situation presents some complex implications and means in particular
that a finitely generated group of prime exponent exists which
has no subgroup of finite index.

For $n = 2$ it has already been noted that

$x^{-1}y^{-1}xy = (x^{-1}y^{-1}x)^2 x^{-2} (xy)^2$

and so $B(2, r) = F_r/N_2$ is Abelian if $N_2$ is the fully invariant
subgroup of $F_r$ generated by all squares. Hence $B(2, r)$ is the
elementary Abelian group of order $2^r$.
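
The identity expressing a commutator as a product of three squares, $x^{-1}y^{-1}xy = (x^{-1}y^{-1}x)^2 x^{-2} (xy)^2$, can be confirmed by free reduction. The sketch below is an illustrative check only: words are strings in which a lowercase letter is a generator and the matching uppercase letter is its inverse, and the helper names are assumptions of this example.

```python
def free_reduce(word):
    """Cancel adjacent inverse pairs (uppercase letter = inverse of lowercase)."""
    out = []
    for ch in word:
        if out and out[-1] == ch.swapcase():
            out.pop()
        else:
            out.append(ch)
    return "".join(out)


def inverse(word):
    """Inverse of a word: reverse it and invert every letter."""
    return word[::-1].swapcase()


def power(word, k):
    """k-th power of a word (k may be negative), freely reduced."""
    base = word if k >= 0 else inverse(word)
    return free_reduce(base * abs(k))


# Left side: the commutator x^-1 y^-1 x y.
lhs = free_reduce("XYxy")
# Right side: (x^-1 y^-1 x)^2 * x^-2 * (xy)^2, a product of three squares.
rhs = free_reduce(power("XYx", 2) + power("x", -2) + power("xy", 2))
```

Both sides reduce to the same word, so every commutator lies in the subgroup generated by squares, which is why the quotient by that subgroup is Abelian.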

In his original paper Burnside proved the finiteness of the groups
$B(3, r)$ but their correct orders and precise description were given
by Levi and van der Waerden [30], and may also be found in the
writer's book [16, p. 320]. These results are expressed in terms
of commutators, where for arbitrary elements of a group $G$ we write
$(x, y) = x^{-1}y^{-1}xy$, $(x, y, z) = ((x, y), z)$,
$(x, y, z, w) = ((x, y, z), w)$, $(x, y; z, w) = ((x, y), (z, w))$.
Then in a group $G$ of exponent three the following relations on
commutators hold:

$(y, x) = (x, y)^{-1}$
$(x, y^{-1}) = (x^{-1}, y) = (x, y)^{-1}$
(6.1) $(x, y, y) = (x, y, x) = 1$
$(x, y, z) = (y, z, x) = (y, x, z)^{-1} = (x, z, y)^{-1}$
$(x, y, z, w) = (x, y; z, w) = 1$

Of these relations the first is an identity in any group. The rest
may be found directly by calculations involving two, three, and four
generators. The last of these is easily derived from the others in
the following way:

(6.2) $(x, y; z, w) = (x, y, (z, w)) = ((z, w), x, y) = ((z, w, x), y)$
$= (x, z, w, y) = (x, z, y, w)^{-1} = ((x, y, z)^{-1}, w)^{-1}$
$= (x, y, z, w)$

But also,

(6.3) $(z, w; x, y) = ((x, y), z, w) = (x, y, z, w)$

Hence $(x, y; z, w) = (z, w; x, y) = (x, y; z, w)^{-1}$. But as
$(x, y; z, w)^3 = 1$, it follows that $(x, y; z, w) = 1$ and with (6.2)
the last relation is established.

It follows from the relations (6.1) that in $B(3, r)$ generated by
$x_1, \ldots, x_r$ every element $g$ can be put in the form

(6.4) $g = x_1^{a_1} \cdots x_r^{a_r} (x_1, x_2)^{b_{12}} \cdots (x_i, x_j)^{b_{ij}} \cdots (x_i, x_j, x_k)^{c_{ijk}} \cdots$

where for $(x_i, x_j)$, $i < j$, and for $(x_i, x_j, x_k)$, $i < j < k$,
and the exponents are 0, 1, or 2. For $g$ can be expressed by some word
$g = u_1 u_2 \cdots u_m$ where each $u_i$ is an $x_j^{\pm 1}$, and using
the identity $wv = vw(w, v)$ we may first move all $x_1$'s to the left,
then all $x_2$'s etc., until we have $g$ as a product
$x_1^{a_1} \cdots x_r^{a_r} \cdot T$ where $T$ is a product of
commutators, and from the relations of (6.1) a commutator $(w, v)$ is
either the identity or equal to one in (6.4) or its inverse, and the
commutators themselves commute. If the elements of (6.4) for
different exponents were not all different, then by putting appropriate
$x_i$'s equal to 1 there would be such an equality involving only
three generators, and direct calculation in $B(3, 3)$ shows that this is
not the case. The foregoing results may be stated as a theorem.

Theorem 27. The group $B(2, r)$ is the elementary Abelian group of
order $2^r$. The group $B(3, r)$ is of order $3^{m(r)}$,
$m(r) = r + \binom{r}{2} + \binom{r}{3}$,
and its elements have the canonical form of (6.4).
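
To make the orders in Theorem 27 concrete, the following short sketch evaluates $2^r$ and $3^{m(r)}$ with $m(r) = r + \binom{r}{2} + \binom{r}{3}$, which counts the generators, the commutators $(x_i, x_j)$ with $i < j$, and the commutators $(x_i, x_j, x_k)$ with $i < j < k$ in the canonical form (6.4). The function names are illustrative.

```python
from math import comb


def m(r):
    """Number of exponent slots in (6.4): r + C(r, 2) + C(r, 3)."""
    return r + comb(r, 2) + comb(r, 3)


def order_B2(r):
    """Order of B(2, r), the elementary Abelian group on r generators."""
    return 2 ** r


def order_B3(r):
    """Order of B(3, r) as stated in Theorem 27."""
    return 3 ** m(r)
```

For two generators this gives $m(2) = 3$ and order $27$; for three generators $m(3) = 7$ and order $3^7 = 2187$.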

In Burnside's original paper he proved that $B(4, 2)$ is finite and
of order at most $2^{12}$, and this is indeed its true order. Sanov [43]
proved $B(4, r)$ finite for all $r$ by introducing certain intermediate
groups. If $B(4, r)$ is generated by $x_1, \ldots, x_r$, let us define
for every $r$, $H_{2r} = B(4, r)$ and
$H_{2r+1} = \{B(4, r), x_{r+1}^2\}$. Thus for each $i$,
$H_{i+1} = \{H_i, x\}$ with $x^2 \in H_i$, since for $H_{2r+1}$,
$x = x_{r+1}^2$ and $x^2 = 1$, and for $H_{2r+2}$, $x = x_{r+1}$. We
prove by induction on $i$ that $H_i$ is finite. Trivially $H_2$ is of
order 4.

Suppose that $H = H_i$ is of finite order $M_i$. Then every element
$g$ of $H_{i+1}$ can be put in the form

(6.5) $g = h_0 x h_1 x h_2 x \cdots x h_{n-1} x h_n$

with $h_j \in H = H_i$, since $x^2 \in H_i$, and $h_i \ne 1$,
$i = 1, \ldots, n - 1$. There are $M_i^2 (M_i - 1)^{n-1}$ elements of
this form, and so if we can find a bound on the $n$ needed for all
elements of $H_{i+1}$ we will have shown that $H_{i+1}$ is finite. This
is what we shall do.

From the relation $(hx)^4 = 1$ we have

(6.6) $xhx = h^{-1}x^{-1}h^{-1}x^{-1}h^{-1} = h^{-1}x(x^2 h^{-1} x^2)xh^{-1} = h^{-1} x h^* x h^{-1}$, $h^* \in H$

For our purposes the form of $h^*$ is not important. But applying
(6.6) to $x h_i x$ in (6.5) we have

(6.7) $g = h_0 x h_1 x \cdots x h_{i-1} h_i^{-1} x h_i^* x h_i^{-1} h_{i+1} x \cdots x h_n$

Thus (6.7) is another expression for $g$ with $n$ unchanged,
$h_0, \ldots, h_{i-2}$ unchanged, and $h_{i-1}$ replaced by
$h_{i-1} h_i^{-1}$. Thus if $h_{i-1} h_i^{-1} = 1$ we may replace $g$ by
an expression of the form (6.5) with a smaller value for $n$. Repeating
this procedure we may replace $h_{i-2}$ by
$h_{i-2}(h_{i-1} h_i^{-1})^{-1}$ or $h_{i-2} h_i h_{i-1}^{-1}$, and so
on. In this way we may replace $h_1$ by any one of $h_1$,
$h_1 h_2^{-1}$, $h_1 h_3 h_2^{-1}$, $\ldots$,
$h_1 h_3 \cdots h_{2s-1} h_{2s}^{-1} \cdots h_4^{-1} h_2^{-1}$,
$h_1 h_3 \cdots h_{2s+1} h_{2s}^{-1} \cdots h_2^{-1}$.
If any one of these is 1 we may reduce the length of $g$. But with
$n > M_i + 1$, either one of these is 1 or two of them are equal, say,

(6.8) $h_1 h_3 \cdots h_{2s-1} h_{2s}^{-1} \cdots h_4^{-1} h_2^{-1} = h_1 h_3 \cdots h_{2t-1} h_{2t}^{-1} \cdots h_4^{-1} h_2^{-1}$

with $t > s$. This implies

(6.9) $h_{2s+1} \cdots h_{2t-1} h_{2t}^{-1} \cdots h_{2s+2}^{-1} = 1$

but this is an element which could have replaced $h_{2s+1}$. Similarly
if the repetition involves
$h_1 h_3 \cdots h_{2s+1} h_{2s}^{-1} \cdots h_2^{-1}$ we
find that $h_{2s+2}$ can be replaced by 1. Thus if $n > M_i + 1$, we can
express $g$ by a shorter expression. Hence any $g$ can be expressed
as a word with $n \le M_i$. It easily follows that $H_{i+1}$ is finite of
order at most $M_i^{M_i + 1}$. We have proved Theorem 28.

Theorem 28. The groups $B(4, r)$ are finite.

The numerical bound for the order of $B(4, r)$ implied by this proof
is hopelessly unsatisfactory. Starting with $2^2$ as the order of
$B(4, 1)$, it gives $2^{10240}$ as a limit on the order of $B(4, 2)$
although the true value is $2^{12}$. C. R. B. Wright [49, 50] has found
considerably more information about the groups $B(4, r)$ which is
discussed in Section 8.

For $n = 6$ and any $r$, the answer to the Burnside problem is
affirmative. Surprisingly enough the exact order of $B(6, r)$ is
known for every $r$, although the corresponding orders for $B(4, r)$ are
at present unknown. This depends on the deep results of Hall and
Higman [19], which are discussed in Section 7.

In concluding this section some remarks will be made about 
Novikov's attack on the Burnside problem. This depends on the 
Morse-Hedlund sequence and some variant of the Tartakovskii 
solution of the word problem, this last not being described in 
sufficient detail so that the full proof can be reconstructed from 
Novikov's announcement. 

The Morse-Hedlund sequence is an infinite sequence of 0's and 1's
which begins

(6.10) 0110100110010110 . . .

This may be described in various ways. If the sequence is

(6.11) $S = c_0 c_1 c_2 \cdots c_n \cdots$

we have the following rule for $c_n$:

Express $n$ in binary form. If the sum of the digits is even,
$c_n = 0$; if odd, $c_n = 1$.

Alternately write $a_0 = 0$, $b_0 = 1$, and define recursively
$a_1 = a_0 b_0$, $b_1 = b_0 a_0$, $\ldots$, $a_{m+1} = a_m b_m$,
$b_{m+1} = b_m a_m$. Here $a_m$ is a sequence of $2^m$ digits, and as
these are the first digits of $a_{m+1}$, it does not matter whether the
initial digits are considered to be in $a_m$ or any $a_s$ with $s > m$.
Thus we obtain an infinite sequence whose first $2^m$ digits, for any
$m$, are the digits of $a_m$, and this is the Morse-Hedlund
sequence of (6.11). Morse and Hedlund [33] have proved that
their sequence $S$ has the property of Theorem 29.

Theorem 29. S does not contain a block of consecutive digits of the 
form EEe where E is a block and e is the first digit of E. 
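
The two descriptions of the Morse-Hedlund sequence, and the property in Theorem 29, can be explored on a finite prefix. The sketch below is illustrative only: it computes digits by the binary digit-sum rule, builds the blocks $a_m$ by the doubling rule $a_{m+1} = a_m b_m$, $b_{m+1} = b_m a_m$, and searches a prefix by brute force for a block of the form $EEe$. The function names and the prefix length of 256 are choices made for this example, and a finite search is of course not a proof.

```python
def c(n):
    """Digit c_n: parity of the sum of the binary digits of n."""
    return bin(n).count("1") % 2


def by_doubling(m):
    """Block a_m from a_0 = 0, b_0 = 1, a_{m+1} = a_m b_m, b_{m+1} = b_m a_m."""
    a, b = "0", "1"
    for _ in range(m):
        a, b = a + b, b + a
    return a


def has_overlap(s):
    """True if s contains a block E E e, where e is the first digit of E."""
    n = len(s)
    for k in range(1, n // 2 + 1):        # k = length of E
        for i in range(0, n - 2 * k):     # need 2k + 1 digits starting at i
            if s[i:i + k] == s[i + k:i + 2 * k] and s[i + 2 * k] == s[i]:
                return True
    return False


# First 256 digits of S from the digit-sum rule.
S = "".join(str(c(n)) for n in range(256))
```

On this prefix the two constructions agree (`by_doubling(8) == S`) and `has_overlap(S)` is false, whereas a string such as `"01010"` does contain a block $EEe$ with $E = 01$.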

From (6.11) we form a sequence $T$ of the letters $a$, $b$, $c$

(6.12) $T = d_0 d_1 d_2 d_3 \cdots d_n \cdots$

by the rules

$d_n = a$ if $c_n c_{n+1} = 01$
(6.13) $d_n = b$ if $c_n c_{n+1} = 10$
$d_n = c$ if $c_n c_{n+1} = 00$ or $11$

Thus (6.12) begins, using (6.10)

(6.14) T = acbabcacbcabacb . . .

They prove Theorem 30.

Theorem 30. $T$ does not have a block of letters of the form $EE$.
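
The passage from the binary sequence to the three-letter sequence $T$ by the rules (6.13) is easy to reproduce. The following sketch is an illustration under stated assumptions: it recomputes the digits $c_n$ from the binary digit-sum rule, applies the rules 01 to $a$, 10 to $b$, 00 or 11 to $c$, and checks a prefix of 300 letters by brute force for a repeated block $EE$. The names `RULE`, `d`, and `has_square` are hypothetical helpers.

```python
def c(n):
    """Digit c_n of the Morse-Hedlund sequence (binary digit-sum parity)."""
    return bin(n).count("1") % 2


# Rules (6.13): the pair c_n c_{n+1} determines the letter d_n.
RULE = {(0, 1): "a", (1, 0): "b", (0, 0): "c", (1, 1): "c"}


def d(n):
    """Letter d_n of the sequence T."""
    return RULE[(c(n), c(n + 1))]


def has_square(s):
    """True if s contains two equal adjacent blocks E E."""
    n = len(s)
    for k in range(1, n // 2 + 1):            # k = length of E
        for i in range(0, n - 2 * k + 1):
            if s[i:i + k] == s[i + k:i + 2 * k]:
                return True
    return False


# First 300 letters of T.
T = "".join(d(n) for n in range(300))
```

The prefix starts `acbabcacbcabacb`, in agreement with (6.14), and contains no block $EE$.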

Now let $G$ be a group of exponent $n$ generated by elements $a$, $b$,
$c$. Each initial segment of $T$ yields an element of $G$. If these are
all different, then $G$ is infinite. If two different initial segments
are the same element, then say $W_1 = W_1 W_2$ and $W_2 = 1$ is a Dyck
word which is a block of $T$ and by Theorem 30 $W_2$ contains no
block of the form $EE$. Thus the crux of Novikov's proof, Novikov's
Announced Theorem, is this: for $n \ge 72$ every Dyck word in
$B(n, r) = F_r/N$ contains a block of the form $EE$.

Thus for $r \ge 3$ the sequence $T$ shows $B(n, r)$ to be infinite. For
$r = 2$ we may replace $a$, $b$, $c$ by 1, 2, 3 and use them as
exponents in $x^1 y^3 x^2 y^1 \ldots$, and again conclude that
$B(n, 2)$ is infinite. The theorem depends on showing that in the
decompositions (5.5) relevant to Tartakovskii's second condition there
is always a segment $A$ remaining after the Dyck word has been reduced
which contains a block of the form $EE$. Thus Novikov's analysis must be
comparable to Tartakovskii's, but the objective is different, namely
to prove the existence of the block $EE$ rather than find an estimate
on length.

The role of the Morse-Hedlund sequence may be replaced by the
following sequence recently found by R. Dean [9]:

$D = x y x^{-1} y^{-1} x y^{-1} x^{-1} y \cdots$

The rule for construction of Dean's sequence $D$ is the following.
Let 1 stand for $x$, 2 for $y$, 3 for $x^{-1}$, 4 for $y^{-1}$. Begin
with 1 2 3 4. At stage $i$ divide all the numbers into 4 blocks of equal
length $A_1 A_2 A_3 A_4$. Stage $i + 1$ consists of
$A_1 A_2 A_3 A_4 A_1 A_4 A_3 A_2$. Iteration yields an infinite sequence
$x^{\pm 1} y^{\pm 1} x^{\pm 1} y^{\pm 1} \ldots$, with no consecutive
block repeat.
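
The block rule for Dean's sequence can be iterated mechanically. In the sketch below, which is only an illustration, the starting block 1 2 3 4 is called stage 0 (a numbering chosen for this example), each stage is cut into four equal blocks $A_1 A_2 A_3 A_4$, and the next stage is $A_1 A_2 A_3 A_4 A_1 A_4 A_3 A_2$. The digits are decoded with uppercase letters standing for inverses, and a prefix is searched by brute force for a repeated block.

```python
def dean_stage(i):
    """Stage i of the construction as a list of digits 1..4 (stage 0 is 1 2 3 4)."""
    seq = [1, 2, 3, 4]
    for _ in range(i):
        q = len(seq) // 4
        A1, A2, A3, A4 = (seq[j * q:(j + 1) * q] for j in range(4))
        seq = A1 + A2 + A3 + A4 + A1 + A4 + A3 + A2
    return seq


# 1 -> x, 2 -> y, 3 -> x^-1, 4 -> y^-1 (uppercase letter = inverse).
LETTER = {1: "x", 2: "y", 3: "X", 4: "Y"}


def as_word(seq):
    """Decode a list of digits into a word in x, y and their inverses."""
    return "".join(LETTER[n] for n in seq)


def has_square(s):
    """True if s contains two equal adjacent blocks."""
    n = len(s)
    for k in range(1, n // 2 + 1):
        for i in range(0, n - 2 * k + 1):
            if s[i:i + k] == s[i + k:i + 2 * k]:
                return True
    return False
```

Stage 1 decodes to `xyXYxYXy`, the displayed beginning of $D$, and each stage is a prefix of the next, so the stages do define a single infinite sequence.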

7. THEOREMS ON p-LENGTH. THE 
BURNSIDE PROBLEM FOR n = 6 

A finite group $G$ is said to be p-solvable, $p$ a prime, if there is a
chain of normal subgroups such that every factor group is either a
p-group or has order prime to $p$, which latter we shall call a
p'-group or non-p-group. Clearly a solvable group is p-solvable for every
prime $p$. A p-solvable group has a unique upper p-series

(7.1) $1 \subseteq N_0 \subset P_1 \subset N_1 \cdots \subset P_i \subset N_i \subset P_{i+1} \cdots \subset P_l \subseteq N_l = G$

where $N_0$ (possibly 1) is the largest normal non-p-subgroup, $P_1/N_0$
is the largest normal p-group in $G/N_0$, and generally $N_i/P_i$ is the
largest normal non-p-subgroup in $G/P_i$, and $P_{i+1}/N_i$ is the
largest normal p-subgroup in $G/N_i$. Here we call the least $l$ for
which $N_l = G$ the p-length of $G$ and write $l = l_p = l_p(G)$. The
concept of p-length was introduced in a major paper by Philip Hall and
Graham Higman [19], and its significance and relationship to the
Burnside problem were developed there. They begin with a fundamental
lemma.

Lemma 5. For all $i \ge 1$, $P_i$ contains the centralizer of
$P_i/N_{i-1}$ and $N_i$ contains the centralizer of $N_i/P_i$.

An improvement of this for the p-factors is the following: 

Lemma 6. If $F_i/N_{i-1}$ is the Frattini subgroup of the p-group
$P_i/N_{i-1}$, then $P_i$ is the centralizer of $P_i/F_i$.

The force of this lemma is that $G/P_i$ is faithfully represented by
automorphisms of $P_i/F_i$ induced by conjugation. Since $V = P_i/F_i$
is an elementary Abelian p-group, if we write $V$ additively the
automorphisms of $V$ are nonsingular linear transformations of $V$
regarded as a vector space over the finite field $GF(p)$. This is an
important instance of internal representation in which the normalizer
$N(A)$ of a subgroup $A$ is represented by the automorphisms induced
in $A$. The study of internal representations has in the last few
years been of the greatest value, and goes well beyond the scope of
this monograph.

Here $H = G/P_1$ is a p-solvable group which has no normal
p-subgroup and is faithfully represented on the elementary Abelian
vector space $V = P_1/F_1$.

The object of the paper is to relate the p-length of $G$ to various
properties of a Sylow p-subgroup $S(p)$ of $G$. In general, the greater
$l(p)$ is, the larger and more complex $S(p)$ must be. They relate
$l(p)$ to the quantities (1) $b = b_p$ where $p^b$ is the order of
$S(p)$, (2) $c = c_p$ the class of $S(p)$, (3) $d = d_p$ the length of
the derived series of $S(p)$, and (4) $e = e_p$ where $p^e$ is the
exponent of $S(p)$, that is the least $e$ for which $x^{p^e} = 1$ for
every $x \in S(p)$. For application to the Burnside problem the exponent
$e_p$ is the most significant of these properties.

Their first main theorem is Theorem 31. 

Theorem 31. Let $H$ be a p-solvable linear group over a field of
characteristic $p$, with no normal p-subgroup greater than 1. If $g$ is
an element of order $p^n$ in $H$ then the minimal equation of $g$ is
$(x - 1)^r = 0$ where (1) if $p$ is neither 2 nor a Fermat prime,
$r = p^n$; (2) if $p$ is an odd Fermat prime $r = p^n$ if $H$ has Abelian
Sylow 2-subgroups and $p^{n-1}(p - 1) \le r \le p^n$ in any case;
(3) if $p = 2$, $r = p^n$ if $H$ has Abelian Sylow q-subgroups for all
Mersenne primes $q$ less than $2^n$; if $q = 2^s - 1$ is the least
Mersenne prime for which this is not so, $2^{n-s} q \le r \le 2^n$ and
in any case $2^{n-2} \cdot 3 \le r$.

These results are best possible, and the way in which the Mersenne 
and Fermat primes enter in shows the deep arithmetical side of 
these results. These lead in turn to Theorem 32. 

Theorem 32. If $G$ is a p-solvable group, where $p$ is an odd prime,
then (1) $d_p \ge l_p$, and (2) $e_p \ge l_p$ if $p$ is not a Fermat
prime, and $e_p \ge [\tfrac{1}{2}(l_p + 1)]$ if $p$ is a Fermat prime.
Furthermore, these results are best possible.

There are also similar relations involving $b_p$ and $c_p$. For $p = 2$
the situation is a little more complicated. It has been shown by
A. H. M. Hoare [25] that for $p = 2$, $l_p \le 3e_p$ and more recently by
F. Gross [10], that $l_p \le 2e_p$, although it may be true that
$l_p \le e_p$.

From Theorem 32, Hall and Higman are able to attack the 
restricted Burnside problem for solvable groups. 

Problem $S_n$. Does there exist for each positive integer $k$ an integer
$s(n, k)$ such that every finite solvable group of exponent $n$ that can
be generated by $k$ elements has order at most $s(n, k)$?

Let us remark that a solvable group $G$ which is finitely generated
and is of exponent $n$ is certainly finite. For $G/G'$ is a finitely
generated Abelian group of exponent $n$ and so is finite. By Theorem 13,
$G'$ is finitely generated and so $G'/G''$ is a finite group. Continuing,
each derived factor $G^{(i)}/G^{(i+1)}$ is finite and since $G$ is
solvable there is an integer $t$ such that $G^{(t)} = 1$. Thus in
$G \supseteq G' \supseteq G'' \supseteq \cdots \supseteq G^{(t)} = 1$,
each factor group is finite and so $G$ itself is finite.

Theorem 33. If $n = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}$ is the
factorization of the integer $n$ as a product of prime powers
$q_i = p_i^{a_i}$, $i = 1, \ldots, r$, then the answer to the problem
$S_n$ is affirmative if the answer to each $S_{q_i}$,
$i = 1, \ldots, r$ is affirmative.

Proof. We use induction on $r$, and take $p_1 = 2$ if $n$ is even.
Then $n = n_1 q_r$ with $n_1 = p_1^{a_1} \cdots p_{r-1}^{a_{r-1}}$ and
$q_r = p_r^{a_r}$. Here $q_r$ is odd, and by Theorem 32 if $G$ is any
finite solvable group generated by $k$ elements of exponent $n$, then
considering $G$ as a $p_r$-solvable group there is a bound on
$l_{p_r} = l$ for the series (7.1) for $G$. But then $G = N_l$ and
$N_l/P_l$ is a group of exponent $n_1$ generated by $k$ elements
and so its order is bounded by $s(n_1, k)$. Then by Theorem 13 $P_l$ is
finitely generated by say $k_1$ elements, and since $P_l/N_{l-1}$ is a
finitely generated group of exponent $q_r$, its order is bounded by
$s(q_r, k_1)$ whence by Theorem 13 $N_{l-1}$ is finitely generated. Thus
in turn we may find a bound on the order of each factor group in the
series (7.1), and as the number of factors is also bounded, by Theorem 32
it follows that there is a bound on the order of $G$. Thus an affirmative
answer to $S_n$ is a consequence of an affirmative answer to $S_{n_1}$
and $S_{q_r}$ and our theorem is proved.

For $n = 6$ the known finiteness of $B(2, r)$ and $B(3, r)$ for all $r$
gives an affirmative answer to Problem $S_6$. From Burnside's celebrated
result that groups of order $p^a q^b$, $p$ and $q$ primes, are solvable,
it follows that a finite group of exponent 6, being of order $2^a 3^b$,
is necessarily solvable. Thus $S_6$ is equivalent to the restricted
Burnside problem for $n = 6$, and we have an affirmative answer to it.
We may even evaluate $s(6, k)$. Let $G$ be finite of exponent 6 and
let (7.1) be the upper 2-series for $G$. Then a Sylow 2-subgroup is
elementary Abelian by Theorem 27 and so centralizes any part of it.
Thus by Lemma 5 as $P_1/N_0$ is a 2-group, $P_1$ must contain an entire
Sylow 2-subgroup, and so $l_2(G) = 1$. Using (2) in Theorem 31 it
follows that $l_3(G) = 1$. This may also be proved directly without
an enormous amount of labor. Thus there are two normal series
for $G$.

(7.2) $1 \subseteq A_1 \subset A_2 \subseteq A_3 = G$
$1 \subseteq B_1 \subset B_2 \subseteq B_3 = G$

where $A_1$, $A_3/A_2$, $B_2/B_1$ are 3-groups and $B_1$, $B_3/B_2$,
$A_2/A_1$ are 2-groups, and so $A_2/A_1$ is a Sylow 2-subgroup of $G$
and $B_2/B_1$ is a Sylow 3-subgroup of $G$.

Now if $G$ is generated by $k$ elements, $G/A_2$ is of exponent 3 and so
of order at most $3^{m(k)}$, $m(k) = k + \binom{k}{2} + \binom{k}{3}$,
by Theorem 27, whence by Theorem 13, $A_2$ has
$a = 1 + (k - 1)3^{m(k)}$ generators, and so $A_2/A_1$ is of order at
most $2^a$. Similarly $G/B_2$ is of exponent 2 and of order at most
$2^k$ whence $B_2$ has $b = 1 + (k - 1)2^k$ generators, and consequently
$B_2/B_1$ being of exponent 3 has order at most
$3^{b + \binom{b}{2} + \binom{b}{3}}$. These give upper bounds on the
order of a Sylow 2-subgroup and a Sylow 3-subgroup, and so we have a
bound on the order of $G$, $s(6, k)$

(7.3) $s(6, k) = 2^a \, 3^{b + \binom{b}{2} + \binom{b}{3}}$, $a = 1 + (k - 1)3^{k + \binom{k}{2} + \binom{k}{3}}$, $b = 1 + (k - 1)2^k$
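
The bound (7.3) is explicit enough to evaluate. The sketch below computes the exponents of 2 and 3 in $s(6, k) = 2^a 3^{b + \binom{b}{2} + \binom{b}{3}}$ with $a = 1 + (k - 1)3^{m(k)}$, $b = 1 + (k - 1)2^k$, and $m(k) = k + \binom{k}{2} + \binom{k}{3}$. The helper names are illustrative, and the generator count $1 + (k - 1)j$ for a subgroup of index $j$ is taken from the text's use of Theorem 13; reading it as the Schreier index formula is an assumption of this example.

```python
from math import comb


def m(r):
    """m(r) = r + C(r, 2) + C(r, 3), the exponent of 3 in the order of B(3, r)."""
    return r + comb(r, 2) + comb(r, 3)


def subgroup_rank(k, index):
    """Generator count 1 + (k - 1) * index for a subgroup of the given index."""
    return 1 + (k - 1) * index


def s6_exponents(k):
    """Return (exponent of 2, exponent of 3) in s(6, k) from (7.3)."""
    a = subgroup_rank(k, 3 ** m(k))   # generators of A_2, index 3^m(k)
    b = subgroup_rank(k, 2 ** k)      # generators of B_2, index 2^k
    return a, m(b)


def s6(k):
    """The bound s(6, k) itself, as an exact integer."""
    e2, e3 = s6_exponents(k)
    return 2 ** e2 * 3 ** e3
```

For $k = 1$ this gives $s(6, 1) = 6$, the order of the cyclic group of order 6; for $k = 2$ the exponents are $(28, 25)$, and for $k = 3$ they are $(4375, 833)$.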

Theorem 33 also gives an affirmative answer to $S_{12}$ since this is
known for $S_3$ and $S_4$, but in the absence of satisfactory values for
$s(4, k)$ we are worse off in attempts to estimate $s(12, k)$.

Using the Hall-Higman results as motivation, the writer [14] has
solved the Burnside problem for exponent 6.

Theorem 34. The answer to the Burnside problem for exponent 6 is
affirmative. The group $B(6, k)$ is of order
$2^a \, 3^{b + \binom{b}{2} + \binom{b}{3}}$,
$a = 1 + (k - 1)3^{k + \binom{k}{2} + \binom{k}{3}}$,
$b = 1 + (k - 1)2^k$.

If $F_k$ is the free group with $k$ generators and $N_2$, $N_3$
respectively the subgroups generated by squares and cubes, then
$F_k/N_2$ is of order $2^k$ and $F_k/N_3$ is of order
$3^{k + \binom{k}{2} + \binom{k}{3}}$, and by Theorem 13,
$N_3$ has $a$ generators and $N_2$ has $b$ generators. Then if
$N_2(N_3)$ is generated by the squares in $N_3$ and $N_3(N_2)$ is
generated by the cubes in $N_2$, then $N_3/N_2(N_3)$ is of order $2^a$
and $N_2/N_3(N_2)$ is of order $3^{b + \binom{b}{2} + \binom{b}{3}}$.
Now for any $x$ in $F_k$, $x^6 = (x^3)^2 = (x^2)^3$ and so
$x^6 \in N_2(N_3)$ and $x^6 \in N_3(N_2)$. Hence $N_6$, the subgroup of
$F_k$ generated by sixth powers, is contained in
$W = N_2(N_3) \cap N_3(N_2)$.
Thus $N_6 \subseteq W$, and $F_k/W$ is finite and has its order divisible
by the number in the theorem. This number being the same as $s(6, k)$
given in (7.3), is the order of $F_k/W$, but it is still conceivable that
$N_6 \subset W$ and $F_k/N_6$ is infinite. Hence if we can show
$B(6, k)$ to be finite, we know what its order must be.

We assume that $G = B(6, k) = F_k/N_6$ is generated by $k$ elements
and is of exponent 6. Then there is a series

(7.4) $G \supseteq M \supseteq M'$

where $M$ is generated by cubes of elements of $G$ and so $G/M$ is
a 3-group and indeed $G/M = B(3, k)$ is of order $3^{m(k)}$,
$m(k) = k + \binom{k}{2} + \binom{k}{3}$, and so $M$ has
$a = 1 + (k - 1)3^{m(k)}$ generators.
Group $M/M'$ is a finite group of exponent 2 and order $2^a$. Group
$M'$ is finitely generated; $M$ is generated by cubes and is also
finitely generated, and so $M$ is generated by the finite number of cubes
needed to express this finite number of generators. If
$x_1, \ldots, x_t$ are the cubes needed to generate $M$, they are of
course elements of order 2. $M'$ is generated finitely and also by
$x_i^{-1} x_j^{-1} x_i x_j$, $i, j = 1, \ldots, t$ and their conjugates
and so by a finite number of these conjugates. Hence $M'$ is generated by
a finite number of elements of the form $abab$, where $a^2 = 1$,
$b^2 = 1$. We now turn to a lemma.

Lemma 7. If a group $H$ is generated by elements $v_1, \ldots, v_n$ and
if any four $v$'s generate a group of exponent 3, then $H$ is of
exponent 3.

This is an easy lemma depending on the observation that the
Levi-van der Waerden relations in equation (6.1) involve at most
four different generators. We now need to prove the following:

Lemma 8. If $a^2 = b^2 = c^2 = d^2 = e^2 = f^2 = g^2 = h^2 = 1$ in a
group of exponent 6, then the subgroup
$K = \{abab, cdcd, efef, ghgh\}$ is of exponent 3.

Once Lemma 8 is proved, we may use Lemma 7 to conclude that
$M'$ is of exponent 3 and so finite, and consequently that
$G = B(6, k)$ is finite, thus proving our theorem. The major part of the
paper consists of calculations needed to establish Lemma 8. Lemma 8
might conceivably be put onto a high-speed computer, since it is an
explicit, although reasonably difficult problem. It is interesting to
see how the general problem as to the finiteness of $B(6, k)$ can be
reduced to the very explicit problem of proving Lemma 8.

Problem 7. Find more accurate descriptions of the groups $B(4, k)$ and
their orders.

Problem 8. Is there a definite integer $m$ such that if $H$ is generated
by $x_1, \ldots, x_n$ for any $n$, then $H$ is of exponent 4 if any $m$
of the $x$'s generate a group of exponent 4?

Problem 9. Prove the groups $B(12, k)$ are finite using the finiteness of
all $B(2, r)$, $B(3, s)$, $B(6, t)$ for any $r$, $s$, $t$.

The 3-length of a group of exponent 12 is at most two, and it has
recently been shown by F. Gross that the 2-length is at most two.

8. THE RESTRICTED BURNSIDE PROBLEM. 

COMMUTATORS AND LIE RINGS 

We use the notations $(x, y) = x^{-1}y^{-1}xy$ for the commutator of
elements $x$ and $y$ in a group $G$. When $H$ and $K$ are subgroups of
$G$ we write $(H, K)$ for the subgroup generated by all commutators
$(h, k)$, $h \in H$, $k \in K$. Given a group $G$ the subgroups
$G_1 = G$, and recursively $G_i = (G_{i-1}, G)$, $i = 2, 3, \ldots$,
form the lower central series of $G$.

(8.1) $G = G_1 \supseteq G_2 \supseteq \cdots \supseteq G_i \supseteq G_{i+1} \supseteq \cdots$

From the definition of the lower central series it follows that if for
some $n$ we have $G_n = G_{n+1}$, then

(8.2) $G_n = G_{n+1} = G_{n+2} = \cdots = G_{n+i} = \cdots$

A group $G$ is nilpotent if there is an $N$ such that $G_N = 1$. A
finite group $G$ is known to be nilpotent if and only if it is a direct
product of p-groups, which are of course its Sylow subgroups. Thus
a group of exponent $p^s$, $p$ a prime, is nilpotent if it is finite. If
$G$ is a free group with $r$ generators $G_n/G_{n+1}$ is a free Abelian
group with $\psi_r(n)$ generators where

(8.3) $\psi_r(n) = \frac{1}{n} \sum_{d \mid n} \mu(d) \, r^{n/d}$

Here $\mu(m)$ is the Möbius function. This formula was found by Witt
[48] and may be found in the writer's book [16, p. 171]. Thus if $G$
is generated by $r$ elements and is of exponent $p^s$, each of the factor
groups $G_i/G_{i+1}$ is finite and so $G$ will be finite if and only if
$G$ is nilpotent. Furthermore, if $G_N = 1$ for some large $N$, and if we
know that $G_n = G_{n+1}$ for some $n$, it follows from (8.2) that
$G_n = 1$. Hence the restricted Burnside problem is answered in the
affirmative if there exists an $n$ such that $G_n = G_{n+1}$. It was
shown by Kostrikin [26] that if $G = B(5, 2)$, then $G_{13} = G_{14}$
and that the largest finite group of exponent 5 with two generators is of
order $5^{33}$ (or $5^{34}$ depending on a remaining ambiguity in his
calculations). Graham Higman [22] solved the restricted Burnside problem
for exponent 5 and any number of generators. Kostrikin [28] solved the
problem for prime exponent $p$.
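
Witt's count of the free generators of $G_n/G_{n+1}$ for a free group on $r$ generators, in its standard form $\psi_r(n) = \frac{1}{n}\sum_{d \mid n} \mu(d)\, r^{n/d}$, is straightforward to evaluate. The sketch below is a plain illustration; the function names `mobius` and `witt` are choices made for this example.

```python
def mobius(n):
    """Möbius function: 0 if n has a squared prime factor, else (-1)^(number of primes)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result


def witt(r, n):
    """psi_r(n) = (1/n) * sum over d | n of mu(d) * r^(n/d)."""
    total = sum(mobius(d) * r ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n
```

For two generators the first values are 2, 1, 2, 3, 6, 9 for $n = 1, \ldots, 6$: two generators, one commutator $(x, y)$ of weight two, two of weight three, and so on.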

Let us represent conjugation by an exponent, writing $x^{-1}ax = a^x$.
With this notation and the convention already introduced
$((x, y), z) = (x, y, z)$ there are a number of identities on
commutators:

(8.4) $(x, x) = 1$, $(y, x) = (x, y)^{-1}$

(8.4a) $(xy, z) = (x, z)^y (y, z) = (x, z)(x, z, y)(y, z)$

(8.5) $(x, yz) = (x, z)(x, y)^z = (x, z)(x, y)(x, y, z)$

(8.6) $(x, y^{-1}, z)^y (y, z^{-1}, x)^z (z, x^{-1}, y)^x = 1$

(8.7) $(x, y, z)(y, z, x)(z, x, y) = (y, x)(z, x)(z, y)^x (x, y)(x, z)^y (y, z)^x (x, z)(z, x)^y$

It is known that $(G_i, G_j) \subseteq G_{i+j}$ and it readily follows
that the right-hand side of (8.7) belongs to
$(G_2, G_2) \subseteq G_4$.
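
Identities of this kind hold in every group, so they can be confirmed by free reduction of words in three free generators. The sketch below is an illustrative check, with the conventions $(x, y) = x^{-1}y^{-1}xy$, $a^x = x^{-1}ax$ and left-normed higher commutators; words are strings in which an uppercase letter is the inverse of the lowercase one, and all function names are assumptions of this example. The forms of the identities tested are written out in the code.

```python
def reduce(word):
    """Free reduction: cancel adjacent inverse pairs (uppercase = inverse)."""
    out = []
    for ch in word:
        if out and out[-1] == ch.swapcase():
            out.pop()
        else:
            out.append(ch)
    return "".join(out)


def inv(w):
    return w[::-1].swapcase()


def mul(*ws):
    return reduce("".join(ws))


def conj(a, x):
    """a^x = x^-1 a x."""
    return mul(inv(x), a, x)


def comm(*ws):
    """Left-normed commutator: (x, y) = x^-1 y^-1 x y, (x, y, z) = ((x, y), z)."""
    acc = reduce(ws[0])
    for w in ws[1:]:
        acc = mul(inv(acc), inv(w), acc, w)
    return acc


def check_identities():
    x, y, z = "x", "y", "z"
    X, Y, Z = inv(x), inv(y), inv(z)
    return {
        # (xy, z) = (x, z)^y (y, z) = (x, z)(x, z, y)(y, z)
        "8.4a": comm(mul(x, y), z)
                == mul(conj(comm(x, z), y), comm(y, z))
                == mul(comm(x, z), comm(x, z, y), comm(y, z)),
        # (x, yz) = (x, z)(x, y)^z = (x, z)(x, y)(x, y, z)
        "8.5": comm(x, mul(y, z))
               == mul(comm(x, z), conj(comm(x, y), z))
               == mul(comm(x, z), comm(x, y), comm(x, y, z)),
        # (x, y^-1, z)^y (y, z^-1, x)^z (z, x^-1, y)^x = 1
        "8.6": mul(conj(comm(x, Y, z), y),
                   conj(comm(y, Z, x), z),
                   conj(comm(z, X, y), x)) == "",
        # (x,y,z)(y,z,x)(z,x,y) = (y,x)(z,x)(z,y)^x (x,y)(x,z)^y (y,z)^x (x,z)(z,x)^y
        "8.7": mul(comm(x, y, z), comm(y, z, x), comm(z, x, y))
               == mul(comm(y, x), comm(z, x), conj(comm(z, y), x),
                      comm(x, y), conj(comm(x, z), y), conj(comm(y, z), x),
                      comm(x, z), conj(comm(z, x), y)),
    }
```

Calling `check_identities()` returns a dictionary keyed by equation number; each entry is true when both sides reduce to the same free word.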

The identities (8.4)-(8.7) resemble very closely those for a Lie
ring. If $R$ is an associative ring we may define a Lie product $[x, y]$
in $R$ by the rule

(8.8) $[x, y] = xy - yx$

With the rule (8.8) and the notation $[[x, y], z] = [x, y, z]$, we
readily verify

(8.9) $[x, x] = 0, \quad [y, x] = -[x, y]$

(8.10) $[x + y, z] = [x, z] + [y, z]$

(8.11) $[x, y + z] = [x, y] + [x, z]$

(8.12) $[x, y, z] + [y, z, x] + [z, x, y] = 0$

Of these (8.12) is known as the Jacobi identity. It has been shown
by Witt [48] and by Garrett Birkhoff [4] that a ring $L$ with an addition,
which is an Abelian group, and a bracket product $[x, y]$ satisfying
the laws of (8.9)-(8.12) can be realized in an appropriate
associative ring $R$ with the rule (8.8).
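As an illustration of the rule (8.8), we may take the associative ring to be the ring of integer matrices and check (8.9)-(8.12) numerically. This is only a sketch under that assumption; the function names are hypothetical and matrices are plain lists of lists.

```python
import random

def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def bracket(*xs):
    """Left-normed Lie product: [x, y] = xy - yx and [x, y, z] = [[x, y], z]."""
    out = xs[0]
    for y in xs[1:]:
        out = mat_sub(mat_mul(out, y), mat_mul(y, out))
    return out

def is_zero(a):
    return all(v == 0 for row in a for v in row)

def lie_identities_hold(x, y, z):
    """Check (8.9), (8.10), (8.11) and the Jacobi identity (8.12)."""
    jacobi = mat_add(mat_add(bracket(x, y, z), bracket(y, z, x)),
                     bracket(z, x, y))
    return (is_zero(bracket(x, x))                                             # (8.9)
            and is_zero(mat_add(bracket(y, x), bracket(x, y)))                 # (8.9)
            and bracket(mat_add(x, y), z) == mat_add(bracket(x, z), bracket(y, z))  # (8.10)
            and bracket(x, mat_add(y, z)) == mat_add(bracket(x, y), bracket(x, z))  # (8.11)
            and is_zero(jacobi))                                               # (8.12)

def random_matrix(rng, n=3, bound=5):
    """An n-by-n matrix with integer entries drawn from [-bound, bound]."""
    return [[rng.randint(-bound, bound) for _ in range(n)] for _ in range(n)]
```

For example, with `rng = random.Random(0)`, evaluating `lie_identities_hold` on three matrices from `random_matrix(rng)` should return `True` for every draw, since the identities follow from associativity alone.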

Following Graham Higman [23], let us show how to derive a Lie
ring $L$ from a group $G$. We choose a series of normal subgroups of
$G$,

(8.13) $G = H_1 \supseteq H_2 \supseteq H_3 \supseteq \cdots \supseteq H_i \supseteq \cdots$

with the property that for all integers $i, j$, we have $(H_i, H_j) \subseteq H_{i+j}$.
(Clearly the lower central series (8.1) is a special case of this.) The
additive group of $L$ is the direct sum of the (necessarily Abelian)
factor groups $H_i/H_{i+1}$ of the series. We call elements of $H_i/H_{i+1}$
homogeneous of degree $i$. Bracket multiplication of homogeneous
elements is defined by

(8.14) $[g_i H_{i+1}, g_j H_{j+1}] = g_i^{-1} g_j^{-1} g_i g_j H_{i+j+1}$

This equation may be extended by the bilinearity rules (8.10) and
(8.11) to the whole of $L$. The rule (8.7) in $G$ gives the Jacobi
identity (8.12) for homogeneous elements in $L$, and bilinearity
extends this to arbitrary elements of $L$. For a detailed treatment
of the relation between commutators in groups and Lie rings see
M. Lazard [29]. This correspondence gives information bearing
on all of $G$ only if for $K = \bigcap_{i=1}^{\infty} H_i$ we have $K = 1$, and otherwise
gives information on $G/K$.

For the Burnside problem the relationship between commutators
and Lie rings is useful only when the exponent is a prime power $p^s$,
and here only for the restricted Burnside problem. If $G$ is of
exponent $p^s$, generated by $k$ elements, and if $L$ is the Lie ring associated
with $G$ by the rule (8.14), then $L^n = 0$ becomes $G_n = G_{n+1}$,
and by (8.2) we have $G_n = 1$. With the same $n$ for every such
finite $G$ we know there is a bound $b(p^s, k)$ on the order of $G$, and we
have an affirmative answer to the restricted Burnside problem.
Thus the solution of the restricted Burnside problem for prime power
exponents is equivalent to showing that the associated Lie ring $L$ is
nilpotent.

Kostrikin [27, 28] solved the restricted Burnside problem for
prime exponent $p$ by an argument which begins with a result by
Philip Hall [18]. This is the result

(8.15) $(y, \underbrace{x, \ldots, x}_{p-1}) \equiv 1 \pmod{G_{p+1}}$

for any $x$ and $y$ in a group $G$ of exponent $p$. Here the commutator
on the left is of weight $p$. In the associated Lie ring this becomes

(8.16) $[y, \underbrace{x, \ldots, x}_{p-1}] = 0$

In (8.15) we are using the notation $((a_1, \ldots, a_{n-1}), a_n) = (a_1, \ldots, a_n)$
and similarly in (8.16). The condition (8.16) is called
the Engel condition of order $p - 1$. To obtain (8.16) from (8.15)
for all elements of $L$ and not merely homogeneous ones requires
a little care. For the proof of this see Graham Higman [22].
Kostrikin shows that a Lie ring $L$ of characteristic $p$ satisfying the
$m$th Engel condition with $m < p$ always contains a nilpotent ideal.
From the general theory of Lie rings it follows that a finitely generated
$L$ with this property is necessarily nilpotent and so for an
appropriate $n$ we have $L^n = 0$, whence $G_n = G_{n+1}$, and we have an
affirmative answer to the restricted Burnside problem.





Kostrikin assumes that $L$ does not contain a nilpotent ideal, and
then finds a nilpotent ideal in $L$. This contradiction proves his
result. For a Lie ring $L$ we define a ring $D(L)$, the ring of inner
derivations, in the following way. For $x \in L$ let $A(x)$ be the
mapping of $L$ into itself

(8.17) $A(x): u \to [u, x]$ for all $u \in L$

Here A(x) is an endomorphism of the additive group of L. The 
ring D(L) is the ring generated by the A(x) and is of course 
associative. From the rules (8.9)—(8.12) it follows that 

(8.18) A(x + y) = A(x) + A(y) 

(8.19) $A([x, y]) = A(x)A(y) - A(y)A(x) = [A(x), A(y)]$

Here (8.19) depends chiefly on the Jacobi identity (8.12) and the
brackets on the right in (8.19) are those given by the rule (8.8) in
the associative ring $D(L)$. If $L$ contains no nilpotent ideal, as
Kostrikin assumes, the mapping (8.17) of $L$ into $D(L)$ is faithful.
For simplicity Kostrikin writes $x$ for $A(x)$ as he has assumed the
mapping to be faithful. Thus the Engel condition (8.16) takes
the form in $D(L)$

(8.20) $x^{p-1} = 0$

The condition $a_1 a_2 \cdots a_m = 0$ in $D(L)$ is equivalent to
$[u, a_1, a_2, \ldots, a_m] = 0$ in $L$ for all $u$.
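The rules (8.18) and (8.19) can be tried out on a small concrete Lie ring. The sketch below is an illustration under an assumption not made in the text: it takes $L$ to be the integer 3-vectors with the cross product as bracket, which satisfies (8.9)-(8.12). The operators act on the right, so that $u\,A(x)A(y) = [[u, x], y]$; all names are hypothetical.

```python
from itertools import product

def bracket(u, x):
    """Cross product of integer 3-vectors, used here as a concrete Lie product [u, x]."""
    return (u[1] * x[2] - u[2] * x[1],
            u[2] * x[0] - u[0] * x[2],
            u[0] * x[1] - u[1] * x[0])

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def A(x):
    """Inner derivation A(x): u -> [u, x], as in (8.17)."""
    return lambda u: bracket(u, x)

def act(u, *ops):
    """Apply operators from left to right: act(u, A(x), A(y)) = [[u, x], y]."""
    for op in ops:
        u = op(u)
    return u

def check_8_18_and_8_19(vectors):
    """Check (8.18) and (8.19) on every triple u, x, y taken from `vectors`."""
    for u, x, y in product(vectors, repeat=3):
        # (8.18): A(x + y) = A(x) + A(y)
        if act(u, A(add(x, y))) != add(act(u, A(x)), act(u, A(y))):
            return False
        # (8.19): A([x, y]) = A(x)A(y) - A(y)A(x)
        lhs = act(u, A(bracket(x, y)))
        rhs = sub(act(u, A(x), A(y)), act(u, A(y), A(x)))
        if lhs != rhs:
            return False
    return True
```

With `vectors = list(product(range(-1, 2), repeat=3))` the check covers all triples of vectors with entries in $\{-1, 0, 1\}$. In this notation a product $a_1 a_2 \cdots a_m$ in $D(L)$ applied to $u$ is `act(u, A(a1), ..., A(am))`, which is the left-normed bracket $[u, a_1, \ldots, a_m]$.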

The main line of Kostrikin’s argument can be expressed by 
several lemmas. 

Definition. Let $c_m$ denote an element of $D(L)$ not zero such that
$c_m b^i c_m = 0$, $i = 0, \ldots, 2m - 1$, for any $b$ in $D(L)$.

Lemma 9. If there is a $c_m$, $m \geq 2$, then one of the following elements
is a $c_{m+1}$:

$c_m$, $c'_m = [c_m a^{2m+1} c_m]$ for some $a$,
or $[c'_m d^{2m+1} c'_m]$ for some $a$ and $d$

Lemma 10. If $L$ satisfies the $n$th Engel condition, $n < p$, and if
$m_0$ is the greatest integer not exceeding $(n - 1)/2$, the ideal $N$ generated
by a $c_{m_0}$ is nilpotent, and indeed $N^2 = 0$.





The combination of these yields the following: 

Lemma 11. If $D(L)$ contains an element $c_2$ then $L$ contains a nilpotent
ideal.

The rest of the work consists in finding an element $c_2$.

Lemma 12. $D(L)$ contains an element $c \neq 0$ with $c^2 = 0$.

Lemma 13. $D(L)$ contains an element $c_2$.

Kostrikin begins by remarking that if $v^m = 0$ in $D(L)$ and if
$v^{m-1} \neq 0$, then for any $w$ in $D(L)$, $[w^{m-1}]^{m-1} = 0$ providing that
$4 < m$. This gives him an element $b \neq 0$ with $b^3 = 0$. The rest
of the argument involves some complicated calculation. This gives
his main result, since he has a contradiction unless $L$ is locally
nilpotent.

Theorem 35. (Kostrikin) A Lie ring $L$ of characteristic $p$ satisfying
the $m$th Engel condition, $m < p$, is locally nilpotent. Consequently
the restricted Burnside problem is answered affirmatively for
prime exponent $p$.

For exponent $p^s$, $s > 1$, present information on the restricted
Burnside problem is very unsatisfactory. The commutator relations
known do not yield a genuine Engel condition. The additive
order of elements in the associated Lie ring is $p^s$ rather than $p$, and
this introduces complications.

In groups of exponent 4, since Sanov has proved the finiteness of
the groups $B(4, k)$, the restricted and unrestricted Burnside problems
coincide. C. R. B. Wright [50] has shown that with $G = B(4, k)$
the class is at most $3k - 1$, showing that $G_{3k} = G_{3k+1} = \cdots$,
whence by Sanov’s result $G_{3k} = 1$. This of course provides a
far more satisfactory basis for estimating the order of $B(4, k)$ than
Sanov’s original work. For exponent 4 we find

(8.21) $(y, x, x, x, x, x) = 1$

for any $x$ and $y$. This is the fifth Engel condition. The writer, in
work as yet unpublished, has found a kind of third Engel condition,
specifically

(8.22) $(x, y, z, w, w, w) \equiv 1 \pmod{G_7}$

Wright’s main result is that

(8.23) $(a_1, a_2, \ldots, a_n) \equiv 1 \pmod{G_{n+1}}$



providing (1) $n > 6$ and (2) four or more of the $a_i$ are equal. This
easily gives the result that the class of $B(4, k)$ is at most $3k$, and a
special argument reduces this to $3k - 1$.

Problem 10. Develop a more precise relationship between Lie rings 
and nilpotent groups. Different groups may correspond to the same 
Lie ring. 

Problem 11. Using (8.6) rather than (8.7), develop a Lie theory for 
the Burnside problem rather than merely the restricted Burnside 
problem. 

Problem 12. Investigate fully invariant subgroups of free groups. 

These are all “word groups.” In particular they include Bruck's [6]
groups of type $(g \to c)$, those groups in which every commutator of
weight $c + 1$ in $g$ elements is the identity.

REFERENCES 

1. R. Baer and F. Levi, “Freie Produkte und ihre Untergruppen,” Comp.
Math. 3 (1938) 391-398.

2. G. Baumslag, W. W. Boone, and B. H. Neumann, “Some unsolvable
problems about elements and subgroups of groups,” Math. Scand. 7 (1959)
191-201.

3. W. W. Boone, “The word problem,” Ann. of Math. 70 (1959) 207-265.

4. G. Birkhoff, “Representability of Lie algebras and Lie groups by matrices,”
Ann. of Math. (2) 38 (1937) 526-532.

5. J. H. Britton, “The word problem for groups,” Proc. London Math. Soc. (3)
8 (1948) 493-506.

6. R. H. Bruck, What makes a group finite? To be published by Prentice-Hall,
Englewood Cliffs, N.J.

7. W. Burnside, “On an unsettled question in the theory of discontinuous
groups,” Quart. J. Pure and Appl. Math. 33 (1902) 230-238.

8. H. S. M. Coxeter and W. O. J. Moser, Generators and relations for discrete
groups, Berlin (1957) Springer.

9. R. Dean, “A sequence without repeats on $x, x^{-1}, y, y^{-1}$,” to appear in
Amer. Math. Monthly.

10. F. Gross, On the %-length of solvable groups , Doctoral Thesis, California 
Institute of Technology (1964). 

11. M. Hall, Jr., “A topology for free groups and related groups,” Ann. of
Math. 52 (1950) 127-139.

12. -, “A basis for free Lie rings and higher commutators in free groups,”
Proc. Amer. Math. Soc. 1 (1950) 575-581.

13. -, “Subgroups of free products,” Pacific J. Math. 3 (1953) 115-120.

14. -, “Solution of the Burnside problem for exponent 6,” Illinois J.
Math. 2 (1958) 764-786.



15. -, “Coset representatives in free groups,” Trans. Amer. Math. Soc.
67 (1949) 421-432.

16. -, The Theory of Groups, New York (1959) Macmillan.

17. M. Hall, Jr. and T. Rado, “On Schreier systems in free groups,” Trans.
Amer. Math. Soc. 64 (1948) 386-408.

18. P. Hall, “A contribution to the theory of groups of prime-power order,” 
Proc. London Math. Soc. 36 (1933) 29-95. 

19. P. Hall and G. Higman, “The p-length of a p-soluble group, and reduction 
theorems for Burnside’s problem,” Proc. London Math. Soc. (3) 7 (1956) 
1-42. 

20. H. Heineken, “Engelsche Elemente der Lange drei,” Illinois J. Math. 5 
(1961) 681-707. 

21. P. Higgins and R. Lyndon, “Equivalence of elements under automorphisms 
of a free group,” to be published. 

22. G. Higman, “On finite groups of exponent five,” Proc. Cambridge Philos. 
Soc. 52 (1956) 381-390. 

23. -, “Lie ring methods in the theory of finite nilpotent groups,” Proc.
Intern. Congr. Math. 1958.

24. -, “Subgroups of finitely presented groups,” Proc. Roy. Soc. London
Ser. A 262 (1961) 455-475.

25. A. H. M. Hoare, “A note on 2-soluble groups,” J. London Math. Soc. 36 
(1960) 193-199. 

26. A. I. Kostrikin, “Solution of the restricted Burnside problem for the 
exponent 5,” Izv. Akad. Nauk SSSR 19 (1955) 233-244 (Russian). 

27. -, “On locally nilpotent Lie rings satisfying an Engel condition,”
Dokl. Akad. Nauk SSSR 118 (1958) 1074-1077 (Russian).

28. -, “On Burnside’s problem,” Dokl. Akad. Nauk SSSR 119 (1958)
1081-1084.

29. M. Lazard, “Sur les groupes nilpotents et les anneaux de Lie,” Ann. Sci.
Ec. Norm. Sup. (3) 71 (1954) 101-190.

30. F. W. Levi and B. L. van der Waerden, “Über eine besondere Klasse von
Gruppen,” Math. Zeit. 32 (1930) 315-318.

31. S. MacLane, “A proof of the subgroup theorem for free products,” Mathe- 
matika 5 (1958) 13-19. 

32. W. Magnus, “Das Identitätsproblem für Gruppen mit einer definierenden
Relation,” Math. Ann. 106 (1932) 295-307.

33. M. Morse and G. A. Hedlund, “Unending chess, symbolic dynamics and a 
problem in semigroups,” Duke Math. J. 11 (1944) 1-7. 

34. B. H. Neumann, “An Essay on free products of groups with amalgam¬ 
ations,” Philos. Trans. Roy. Soc. London Ser. A 246 503-554. 




Some Aspects
of the Topology
of 3-Manifolds Related to
the Poincaré Conjecture

R. H. Bing


1. INTRODUCTION 

One of the most interesting questions in the study of 3-manifolds 
asks if the following conjecture is true. 

Poincaré Conjecture. A compact connected 3-manifold is topologically
a 3-sphere if it is simply connected.

Our discussion will center around some of the aspects of 3-manifolds
as they relate to this conjecture.

The last decade has seen the solutions of many interesting and
important problems about 3-manifolds. When we consider some
of the remarkable results obtained, it is a bit surprising that the
Poincaré conjecture remains unconquered. Before reviewing some
definitions at the end of this section, we mention several of these
problems.

Moise showed [15] that for open subsets $U$ of Euclidean 3-space
$E^3$, homeomorphisms can be approximated by piecewise linear
homeomorphisms. If $U$ is an open subset of $E^3$, $f$ is a continuous
positive function defined on $U$, and $h: U \to E^3$ is a homeomorphism,
there is a piecewise linear homeomorphism $g: U \to E^3$ such that
distance $[h(x), g(x)] < f(x)$ for each point $x$ of $U$ (see Figure 1).
This result was used to show [16] that 3-manifolds can be triangulated.
The result also yields the Hauptvermutung: if $T_1$, $T_2$ are
two triangulations of the same 3-manifold, then $T_1$, $T_2$ have
isomorphic rectilinear subdivisions. Hence, in attacking the
Poincaré conjecture, it suffices to limit our consideration to piecewise
linear 3-manifolds.

Papakyriakopoulos [18] made a remarkable breakthrough when
he proved Dehn’s lemma, which states that if $D$ is a 2-simplex, $M^3$
is a combinatorial 3-manifold, and $f: D \to M^3$ is a piecewise linear
map such that for some neighborhood $N$ of Bd $D$ in $D$, $f^{-1}: f(N) \to D$
is a homeomorphism, then $f(\operatorname{Bd} D)$ bounds a polyhedral disk in $M^3$
(see Figure 2). The theorem was proved by scissors and paste
techniques. Using similar techniques, Papa established [17] the
loop theorem: if $M^3$ is a triangulated 3-manifold with boundary
and $L$ is a polyhedral singular closed curve on Bd $M^3$ that can be
shrunk to a point in $M^3$ but not on Bd $M^3$, then there is a polyhedral
simple closed curve with the same property.

Another noteworthy theorem by Papa [18] was the sphere
theorem, which states that if $S^2$ is a 2-sphere, $M^3$ a compact triangulated
3-manifold, and $f: S^2 \to M^3$ a map of $S^2$ into $M^3$ that is
not homotopic to a constant, there is a polyhedral 2-sphere in $M^3$
which cannot be shrunk to a point in $M^3$. We might hope that this
sphere theorem would be of some help in classifying compact
3-manifolds, but again the Poincaré conjecture gets in the way [19,
14]. Papa made an attack [20] on the Poincaré conjecture, but the
victory was only partial.

Figure 1

Figure 2. Singular disk $f(D)$; singularities removed.

Bing partially bridged the gap between the study of the topological
embeddings of objects in triangulated 3-manifolds and the
piecewise linear embeddings of these objects when he showed
[4] that any surface in $E^3$ can be approximated by a polyhedral
surface. If $M^2$ is a 2-manifold, $f$ is a continuous nonnegative function
defined on $M^2$, and $h: M^2 \to E^3$ is a homeomorphism of $M^2$
onto a closed subset of $E^3$, there is a piecewise linear homeomorphism
$g: M^2 \to E^3$ such that for each point $x \in M^2$, distance $[h(x), g(x)] \leq f(x)$
and $g$ is locally piecewise linear at $x \in M^2$ if $f(x) > 0$. Figure 3
indicates how to start building an approximation to $h(M^2)$. This
theorem (with later refinements of it) was used to give [6] an
alternative proof of Moise’s results on the triangulation of 3-manifolds
and of his results that homeomorphisms of open subsets of $E^3$
can be approximated by piecewise linear homeomorphisms.

Bing made an attack on the Poincaré conjecture in [5] but
achieved only a partial solution. Other partial solutions not previously
appearing in the literature are provided by Theorems 7, 17,
and 18 of this chapter.

An n-manifold is a separable metric space, each of whose points
has a neighborhood homeomorphic with Euclidean n-space $E^n$.
Some mathematicians denote $E^n$ by $R^n$ since it is the cartesian
product of $n$ copies of the real line. An n-manifold with boundary
is a separable metric space, each of whose points has a neighborhood
whose closure is topologically equivalent to an n-cell $I^n$. Hence,
an n-manifold is an n-manifold with boundary but not conversely.
If $M^n$ is an n-manifold with boundary, we use Int $M^n$ to denote the
set of points of $M^n$ which have neighborhoods homeomorphic to
$E^n$ and Bd $M^n$ to denote $M^n - \operatorname{Int} M^n$. If Bd $M^n \neq \emptyset$, we say
that $M^n$ is an n-manifold with nonnull boundary. It may be shown
that here Bd $M^n$ is an $(n - 1)$-manifold.

Figure 3. $h(M^2)$ and $g(M^2)$.

An n-simplex in $E^m$ ($m \geq n$) is the convex hull of $n + 1$ points in
$E^m$ such that these points do not lie in any $(n - 1)$-plane in $E^m$.
Each convex hull of a subset of the $n + 1$ points is a simplex which
is a face of the original n-simplex. In an abstract metric set, an
n-simplex is a set isometric with an n-simplex in $E^n$. The faces of
this abstract n-simplex are the images of the faces of its isometric
image in $E^n$.

A geometric complex is a metric space which is the sum of a locally 
finite collection of simplexes such that if two of these simplexes 
intersect, the intersection is a face of each. Suppose this locally 
finite collection $T$ of simplexes contains each face of each element of
$T$. Then $T$ is called a triangulation of the geometric complex. For
a triangulation of a 3-manifold, the 3-simplexes are tetrahedra. We 
sometimes refer to the 2-simplexes and 1-simplexes as faces and 
edges of the tetrahedra, although strictly speaking from the defini¬ 
tions given previously, a face need not be 2-dimensional. 

A triangulated manifold is a geometric complex which is a manifold 










together with some particular triangulation of the geometric com¬ 
plex. To triangulate a manifold, we may need to re-metrize it so 
that we have the sum of simplexes of the right sort. When we 
speak of a triangulation of E n , however, unless we indicate to the 
contrary, we mean a rectilinear triangulation of E n where the 
simplexes of the triangulation are simplexes under the ordinary 
Euclidean metric. 

Suppose $T_1$ is a triangulation of a geometric complex $C$. A subdivision
$T_2$ of $T_1$ is a triangulation of $C$ (keeping the same metric for
$C$) such that each simplex of $T_2$ lies in a simplex of $T_1$.

The i-skeleton of a triangulation is the sum of all i-simplexes in it. 

A geometric complex may have several triangulations. A set Y 
in a geometric complex is called a polyhedron if the geometric com¬ 
plex has a triangulation (without re-metrization of the complex) 
such that Y is the sum of simplexes of the triangulation. A set Z 
is tame in a geometric complex C if there is a homeomorphism h of C 
onto itself such that h(Z) is a polyhedron. A closed set X in C 
which is not tame but is homeomorphic with a polyhedron is called 
wild. A set $X$ is locally tame at a point $p$ of an n-manifold if there
is a neighborhood $N$ of $p$ and a homeomorphism $h: N \to E^n$ such
that $h(N \cdot X)$ is a polyhedron in $E^n$.

A map $f$ of a geometric complex $C$ into a geometric complex $C'$ is
piecewise linear if there is a triangulation $T$ of $C$ such that $f$ is linear
on each simplex of $T$. The map is called locally piecewise linear at a
point $p$ if for some neighborhood $N$ of $p$ there is a triangulation $T'$ of
$N$ such that $f$ is linear on each element of $T'$. If $f$ is locally piecewise
linear except possibly at points of a set $X$, we say that it is
locally piecewise linear mod $X$.

A triangulated n-manifold M n is a piecewise linear manifold if for 
each point p of M n there is a piecewise linear homeomorphism of E n 
onto a neighborhood of p in M n . Although it is known that any 
triangulation of a 3-manifold yields a piecewise linear manifold, the 
corresponding result is not known for triangulated 4-manifolds (see 
Section 8). 

If s is a simplex of a triangulation T of a geometric complex, 
Star (s, T) is the sum of all simplexes of T that contain s. If s, s' are 
mutually exclusive simplexes that are faces of the same simplex s", 
the join of s and s' is the sum of all segments in s" from a point of s 
to a point of s'. Link (s, T) is the sum of all simplexes s' of T such 
that the join of s and s' is a simplex of T. Some authors refer to 






the link of $s$ as the complement of $s$. When $T$ is a triangulation of a
manifold and $s$ is a vertex of $T$, Link $(s, T)$ = Bd Star $(s, T)$.
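The definitions of Star, Link, and the i-skeleton are combinatorial and can be tried out on a small example. The following is a minimal sketch under an assumption not made in the text: a triangulation is recorded abstractly as a set of simplexes, each a frozenset of vertex labels, and a sum of simplexes is represented by the set of those simplexes together with their faces. The function names are illustrative only.

```python
from itertools import combinations

def closure(simplexes):
    """All nonempty faces of the given simplexes (each a collection of vertex labels)."""
    faces = set()
    for s in simplexes:
        s = tuple(s)
        for k in range(1, len(s) + 1):
            faces.update(frozenset(c) for c in combinations(s, k))
    return faces

def star(s, T):
    """Star (s, T): the simplexes of T that contain s, together with their faces."""
    s = frozenset(s)
    return closure([t for t in T if s <= t])

def link(s, T):
    """Link (s, T): the simplexes s' of T disjoint from s whose join with s is in T."""
    s = frozenset(s)
    return {t for t in T if not (t & s) and (t | s) in T}

def skeleton(i, T):
    """The i-simplexes of T (an i-simplex has i + 1 vertices); their sum is the i-skeleton."""
    return {t for t in T if len(t) == i + 1}

# Example: the boundary of a tetrahedron with vertices 0, 1, 2, 3, a triangulated 2-sphere.
sphere = closure(combinations(range(4), 3))
```

Here `link((0,), sphere)` is the boundary of the triangle with vertices 1, 2, 3, and it coincides with the simplexes of `star((0,), sphere)` that do not contain the vertex 0, in agreement with Link $(s, T)$ = Bd Star $(s, T)$ for a vertex of a triangulated manifold.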

We use $I^n$ to denote an n-cell. A topological set $X$ is n-connected
if each map of Bd $I^{n+1}$ into $X$ can be extended to map $I^{n+1}$ into $X$.
This is a bit more delicate than terminology used by some authors
who say that a set is i-connected if and only if for each $j \leq i$, the
set is j-connected as we have defined it. A set is called simply connected
if it is 1-connected. A set $U$ is locally n-connected at a point
$p$ of $\bar{U}$ if for each neighborhood $N_1$ of $p$ there is a neighborhood $N_2$ of
$p$ such that each map of Bd $I^{n+1}$ into $N_2 \cdot U$ can be extended to
map $I^{n+1}$ into $N_1 \cdot U$. Note that $p$ need not be a point of $U$.

A compact connected simply connected 3-manifold is called a
homotopy 3-sphere. The Poincaré conjecture claims that a homotopy
3-sphere is a 3-sphere, but the issue is still in doubt.

A homotopy is a continuous 1-parameter family of maps $f_t$ ($0 \leq t \leq 1$).
If $f: X \times [0, 1] \to Y$, we find it convenient to denote
$f(x \times t)$ by $f_t(x)$. We say that the maps $g_1: X \to Y$, $g_2: X \to Y$ are
homotopic in $Y$ provided there is a map $f: X \times [0, 1] \to Y$ such that
$f_0 = g_1$, $f_1 = g_2$. The homotopy is called an isotopy if each $f_t$ is a
homeomorphism.

A constant map of $X$ is a map that sends all of $X$ into a single
point. We say that a set $X$ can be shrunk to a point in a set $Y$ if
there is a map $f: X \times [0, 1] \to Y$ such that $f_0 = I$ (identity map)
and $f_1$ is a constant map.


2. EXAMPLES OF COMPACT 3-MANIFOLDS 

Perhaps the best known example of a 3-manifold is Euclidean
3-space $E^3$ itself. Space $E^3$ is not compact, however, since it has an
infinite set of points with no limit point. Examples of compact
3-manifolds are somewhat artificial since there are no physical
objects in $E^3$, the world in which we presume we live, which are
examples of compact 3-manifolds. We do have objects in $E^3$ which
are examples of compact 3-manifolds with boundaries such as a
cube, a solid torus, a cube with handles, a cube with a knotted hole,
and a punctured cube. We also have examples of compact 2-manifolds.
In this section we use these real examples to describe
examples of compact 3-manifolds which are abstract in the sense
that in $E^3$, there are no physical objects topologically equivalent to
them.



Sewing 3-manifolds with boundaries together is a convenient way
of constructing examples of 3-manifolds. Suppose $X$, $Y$ are two
topological sets and $h$ is a homeomorphism of a closed subset $A$ of $X$
onto a closed subset $h(A)$ of $Y$. Then $X +_h Y$ denotes the topological
set obtained by sewing $X$ and $Y$ together along $A$ and $h(A)$
so that a point of $A$ is identified with its image under $h$. It is convenient
to think of $X$ and $Y$ as intersecting only in $A = h(A)$ so
that $a$ coincides with $h(a)$. An open set in $X +_h Y$ is either an
open set in $X - A$, an open set in $Y - h(A)$, or a set of the form
$U_1 + U_2$ where $U_1$ is an open set in $X$, $U_2$ is an open set in $Y$, and
$h(U_1 \cdot A) = U_2 \cdot h(A)$.

1. $S^n$. An n-sphere $S^n$ is a set homeomorphic to the boundary of
a unit $(n + 1)$-ball in $E^{n+1}$. It might have the equation
$x_1^2 + x_2^2 + \cdots + x_{n+1}^2 = 1$. Similarly, $S^n$ is a set homeomorphic
with Bd $I^{n+1}$.

2. $S^3$. Since $E^4$ does not seem as real to many of us as $E^3$, it may
be more convenient to regard $S^3$ as the one-point compactification
of $E^3$. Here, a deleted neighborhood of the ideal point at infinity
is the exterior of a closed 3-ball; the larger the 3-ball, the smaller
the neighborhood.

3. $S^3$ as a toroidal manifold. Suppose $T$ is the solid torus
obtained by revolving in $E^3$ the disk $(x - 2)^2 + y^2 \leq 1$ in the
xy-plane about the y-axis. The simple closed curve $J_M$ defined by
$z = 0$, $(x - 2)^2 + y^2 = 1$ bounds a disk in $T$ but not a disk on
Bd $T$ and is called a meridianal simple closed curve of Bd $T$. The
simple closed curve $J_L$ defined by $y = 0$, $x^2 + z^2 = 1$ bounds a
2-manifold in $E^3 - \operatorname{Int} T$ but none on Bd $T$ and is called a longitudinal
simple closed curve of Bd $T$. Suppose $T_1$, $T_2$ are two copies
of $T$ with $J_{M1}$, $J_{L1}$, $J_{M2}$, $J_{L2}$ being the corresponding meridianal
and longitudinal simple closed curves. If $h$ is a homeomorphism of
Bd $T_1$ onto Bd $T_2$ that takes $J_{M1}$ onto $J_{L2}$, then $T_1 +_h T_2$ is
topologically $S^3$. It may be convenient to regard $T_1$, $T_2$ as being
two linked inner tubes plus their interiors (see Figure 4). The
point at infinity is moved to be on the interior of $T_2$. Then the
two inner tubes are blown up so that Bd $T_1$ and Bd $T_2$ coincide
and $T_1 + T_2$ fills up $E^3$ plus the point added at infinity.

4. Toroidal manifold $T(m, n)$. Suppose that $m$, $n$ are relatively
prime positive integers; $T_1$, $T_2$ are as in the preceding example; and
$p_1, p_2, \ldots, p_n$ are $n$ points on $J_{M2}$ cyclicly ordered on $J_{M2}$ as
indicated by their subscripts. Let $J(m, n)$ be a simple closed curve
on Bd $T_2$ that intersects $J_{M2}$ only at $p_1, p_2, \ldots, p_n$; crosses $J_{M2}$
at each such point of intersection, with the $p_i$'s cyclicly ordered on
$J(m, n)$ as $p_1, p_{m+1}, p_{2m+1}, \ldots, p_{(n-1)m+1}$ (see Figure 5). Let $h$
be a homeomorphism of Bd $T_1$ onto Bd $T_2$ taking $J_{M1}$ onto $J(m, n)$.
Then $T_1 +_h T_2 = T(m, n)$. It is known that $T(1, 1)$ is the only
simply connected toroidal manifold. As pointed out in the previous
example, $T(1, 1)$ is topologically $S^3$. Hence toroidal manifolds do
not furnish a clue as to whether or not the Poincaré conjecture is
true. Had $T(0, 1)$ been defined, it would represent the cartesian
product $S^2 \times S^1$.

Figure 4. $T_1 +_h T_2 = S^3$

5. Double toroidal manifolds. Suppose that $X$ is a
tubular neighborhood of a figure eight in $E^3$. It may be regarded as a
cube with two handles. Suppose $X_1$, $X_2$ are two copies of $X$ and $h$
is a homeomorphism of Bd $X_1$ onto Bd $X_2$. Then $X_1 +_h X_2$ is a
double toroidal manifold. It is not known whether or not each
simply connected double toroidal manifold is topologically $S^3$.

6. n-Tuple toroidal manifolds. Suppose $X$ is a cube with $n$
handles and $X_1$, $X_2$ are two copies of $X$. For each homeomorphism
$h: \operatorname{Bd} X_1 \to \operatorname{Bd} X_2$, $X_1 +_h X_2$ is an n-tuple toroidal manifold. It is
known that each compact connected orientable (does not contain a
solid Klein bottle) 3-manifold can be regarded as an n-tuple toroidal
manifold. To regard it as such, consider a triangulation of the
manifold and let $X_1$ be a tubular neighborhood of its 1-skeleton.
If there is a counterexample to the Poincaré conjecture, it lies
among the n-tuple toroidal manifolds. It would be interesting to
know for what integers $n$ there is no counterexample. This attack
seems difficult since we have only found the answer for $n = 1$.

7. Homology spheres. Let $C_1$, $C_2$ be cubes with knotted holes
(see Figure 6). Let $J_{Mi}$ be a simple closed curve on Bd $C_i$ such that
$J_{Mi}$ bounds an orientable 2-manifold in $C_i$ but not on Bd $C_i$ and
$J_{Li}$ be another simple closed curve on Bd $C_i$ that intersects $J_{Mi}$ at
just one point and crosses it there. If $h: \operatorname{Bd} C_1 \to \operatorname{Bd} C_2$ is a
homeomorphism that takes $J_{M1}$, $J_{L1}$ onto $J_{L2}$, $J_{M2}$ respectively,
then $C_1 +_h C_2$ is a 3-manifold with the same homology as $S^3$, but
which is not a homotopy sphere. The reason that $C_1 +_h C_2$ is not
a homotopy sphere is as follows. If a tame 2-dimensional torus
separates a homotopy 3-sphere, it follows from the loop theorem
and Dehn's lemma that one of the pieces into which the torus
separates the homotopy 3-sphere has the same fundamental group
as a circle. These homology spheres we have described give
counterexamples to the Poincaré conjecture in its original form
(see Section 8).

Figure 5. $T(2, 5)$

Figure 6

8. A candidate for a counterexample. Suppose $J_{L1}$ is selected in
the Bd $C_1$ of Figure 6 so that $J_{L1}$ goes through the hole once and does
not bound in $E^3 - \operatorname{Int} C_1$. Let $T$ be a solid torus and $h: \operatorname{Bd} C_1 \to \operatorname{Bd} T$
be a homeomorphism that takes $J_{L1}$ onto a meridianal simple
closed curve on Bd $T$. Is $C_1 +_h T$ a counterexample to the Poincaré
conjecture? The fundamental group of $C_1$ can be found to be
$G = \{a, b, c, d / cac^{-1}b^{-1} = bdc^{-1}d^{-1} = aca^{-1}d^{-1} = 1\}$. After $T$ is
sewn in, the fundamental group of the sum is
$G' = \{cac^{-1}b^{-1} = bdc^{-1}d^{-1} = aca^{-1}d^{-1} = c^{-1}da^{-1}ba = 1\}$. Whether or not $C_1 +_h T$
is a homotopy sphere depends on whether or not $G'$ is trivial.

Perhaps by drilling knotted holes (possibly different from the one
shown in Figure 6) and sewing back in solid tori, we obtain a group
whose relations, through some quirk, make the group trivial but
obtain a sum that is not $S^3$. Two difficulties serve as blocks to
this approach. First, the algebra is fierce and it seems only by
chance that we can show a group to be trivial. Second, even if the
algebra succeeds, we still need to show that the resulting space is
topologically different from $S^3$. Bing has shown [5] that no counterexample
results if the knot in $C_1$ is a trefoil knot, and Hempel [11]
has extended this result by showing that no counterexample results
for certain other knots. The four-crossing knot shown in Figure 6
still baffles us.

9. Projective 3-space $P^3$. Suppose $B$ is a unit ball in $E^3$ and $P^3$
is the space obtained by sewing each point $p$ of Bd $B$ onto its
diametrically opposite point (antipodal point). Note that under
the identification, Bd $B$ goes onto a projective plane which is a
one-sided surface in $P^3$. It follows readily that $P^3$ is not simply connected
since this projective plane does not separate $P^3$. In a simply
connected 3-manifold, each 2-manifold separates it.

10. Lens space $L(m, n)$. Let $B$ be the unit ball in $E^3$ described
by the equation $x^2 + y^2 + z^2 \leq 1$; $m$, $n$ be relatively prime positive
integers; and $L(m, n)$ be the decomposition space of $B$ obtained by
identifying the upper half of Bd $B$ with the lower half under a rotation
through $2m\pi/n$ radians and a reflection through the $z = 0$ plane.
Each point on $z = 0$, $x^2 + y^2 = 1$ is identified with $n - 1$ other
points; each point of $x^2 + y^2 + z^2 = 1$, $z > 0$ is identified with one
of $x^2 + y^2 + z^2 = 1$, $z < 0$; and points of $x^2 + y^2 + z^2 < 1$ are
left alone. It may be shown that each lens space is a toroidal
manifold. The tori correspond to tubular neighborhoods of the
axis and the equator. The upper and lower halves of Bd $B$ shown
in Figure 7 are shown making a 72° angle with each other so that
when sewing is done the tubular neighborhood of the equator looks
like a torus without unnecessary distortion. Note that $L(1, 1)$ is
topologically equivalent to $S^3$. In fact $L(m, n)$ is topologically
equivalent to $T(m, n)$. Each lens space is a toroidal manifold and
$S^2 \times S^1$ is the only toroidal manifold which is not a lens space.

11. Cartesian products. Taking the cartesian product of S 1 with 
various compact 2-manifolds provides us with an abundant supply 
of compact 3-manifolds. None of these are homotopy spheres. 

12. Replacing tori. It is possible to obtain a set topologically
equivalent to $S^2 \times S^1$ by removing an unknotted solid torus from
$S^3$ and sewing it back in so that the meridianal and longitudinal
simple closed curves on the boundary are interchanged. In fact,
A. H. Wallace [28] and W. B. R. Lickorish [10] have shown that any
compact connected orientable 3-manifold whatever can be obtained
by removing a finite collection of mutually exclusive solid tori and
sewing them in differently.

Figure 7

13. R. H. Fox has suggested [8] branched covering spaces of
$S^3$ as a possible place to look for counterexamples to the Poincaré
conjecture.
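The cyclic order $p_1, p_{m+1}, p_{2m+1}, \ldots, p_{(n-1)m+1}$ used for $J(m, n)$ in Example 4 relies on $m$ and $n$ being relatively prime, since only then does the list run through each of the $n$ points exactly once. The short sketch below illustrates this arithmetic; it assumes subscripts are read modulo $n$ into the range $1, \ldots, n$, and the function names are hypothetical.

```python
from math import gcd

def crossing_order(m, n):
    """Subscripts 1, m+1, 2m+1, ..., (n-1)m+1, reduced mod n into the range 1..n."""
    return [(k * m) % n + 1 for k in range(n)]

def visits_every_point_once(m, n):
    """True when the list above is a rearrangement of 1, 2, ..., n."""
    return sorted(crossing_order(m, n)) == list(range(1, n + 1))
```

For the manifold $T(2, 5)$ of Figure 5, `crossing_order(2, 5)` gives the subscripts 1, 3, 5, 2, 4, and `visits_every_point_once(m, n)` agrees with `gcd(m, n) == 1` for all small positive $m$, $n$.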

3. IMPROVING TRIANGULATIONS 

Consider a 2-dimensional torus with a small hole. If the hole is 
large, we can easily see how to triangulate the 2-manifold with 
boundary so that all the vertices lie on the boundary. If the hole 
is small, the result remains true but is more surprising. 

There are many different ways of triangulating the same 3-manifold
with boundary. If we triangulate a simply connected 3-manifold
and remove the interior of one 3-simplex, what is left is called a
fake cube. Like a 3-cell, its boundary is a 2-sphere and it is
connected as well as simply connected. If it is a 3-cell, the original
3-manifold was a 3-sphere, but if it is not a 3-cell, the original
3-manifold was a counterexample to the Poincaré conjecture.

It might conceivably be of some help in deciding whether a fake
cube is a real cube if the fake cube has a "nice" triangulation.
Theorem 1 shows that each fake cube has a triangulation all of
whose vertices lie on the boundary. The theorem can be proved
for arbitrary compact, connected, combinatorial n-manifolds, but
for simplicity we only discuss the case n = 3.

Theorem 1. Each compact connected 3-manifold M^3 with non-null
boundary has a triangulation all of whose vertices are on Bd M^3.

proof. It follows from [3] that M^3 has a triangulation T'. If
a vertex of T' lies in Int M^3, we use two steps to reduce the number
of vertices in Int M^3. Let vv' be an edge of T' with v ∈ Bd M^3 and
v' ∈ Int M^3. If Link (v', T') has a 2-simplex on Bd M^3, we can
skip the first step.

1. Let U be the open half-star of v, that is, the set of all points of
Star (v, T') which lie closer to v than to the other end of the segment
through them from v to Link (v, T'). Note that M^3 - U is
topologically M^3. However, T' is not a triangulation of M^3 - U since
some of the original 3-simplexes of T' intersect M^3 - U in a
tetrahedron with one end blunted. Suppose T'' is a triangulation of
M^3 - U obtained by leaving the elements of T' in M^3 - Star (v, T')
alone and subdividing the blunted tetrahedra so as not to introduce
new corners and so that some 3-simplex of T'' with a face on
Bd(M^3 - U) has what is left of vv' as an edge (see Figure 8).

2. Let s^3 be a 3-simplex of T'' with v' as vertex and with a face on
Bd(M^3 - U). Denote the closure of (M^3 - U) - s^3 by N^3. Then
T'' has one fewer vertex in Int N^3 than T' has in Int M^3. Since N^3
is topologically equivalent to M^3, there is a triangulation of M^3
that has fewer vertices in Int M^3 than T' does. An iteration of this
procedure yields the required triangulation T.

Theorem 2. Suppose T is a triangulation of a compact 3-manifold
with boundary M such that each vertex of T lies on Bd M and each
3-simplex of T has at least two edges on Bd M. Then M has a
triangulation T' such that each vertex of T' lies on Bd M and each
2-simplex of T' has an odd number of edges on Bd M.

proof. Suppose T is chosen so as to have a minimal number of
2-simplexes not on Bd M while maintaining the property that all
vertices of T lie on Bd M and each 3-simplex of T has at least two
edges on Bd M. We show that this minimal triangulation of M
suffices for T'.





Figure 8 


Assume that T has a 2-simplex v_1v_2v_3 with precisely two edges
v_1v_2, v_1v_3 on Bd M. We split M along this 2-simplex (see Figure 9).
Let X be the star of v_1 with respect to T (sum of all simplexes of T
that contain v_1). Then X is a cube which is the sum of two cubes
X_1, X_2 such that X_1 · X_2 = v_1v_2v_3. Split M along v_1v_2v_3 to
get a 3-manifold with boundary homeomorphic with M with a
triangulation T_1 such that a simplex of T_1 is either a simplex of T
that misses v_1, a simplex of T in X_1, or a simplex of T in X_2 with
the vertex v_1 replaced with a new vertex v_1'. Note that T_1 has an
extra vertex v_1', two extra edges v_1'v_2, v_1'v_3, and an extra face
v_1'v_2v_3, but it has one fewer 2-simplex not on Bd M. Since T was
already minimal in this respect, there is no v_1v_2v_3 as assumed at
the beginning of this paragraph.

Figure 9

We now show that under the triangulation T we have chosen,
each 2-simplex of T has at least one edge on Bd M. Suppose v_1v_2v_3
is a 2-simplex of T with no edge on Bd M; then if v_1v_2v_3v_4 is a
3-simplex of T, v_1v_2v_3v_4 has at least two edges on Bd M by
hypothesis and one of the three 2-simplexes v_2v_3v_4, v_1v_3v_4,
v_1v_2v_4 has precisely two edges on Bd M. This was ruled out in the
preceding paragraph. Hence each 2-simplex of T has either one or
three edges on Bd M.
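
The hypotheses and conclusion of Theorem 2 are finite counting conditions, so they can be checked mechanically on a small triangulation. The sketch below is illustrative only: the function name `boundary_edge_report` and the encoding of a triangulation as a list of 4-tuples of vertex labels are assumptions, and a 2-simplex is taken to lie on the boundary exactly when it belongs to a single 3-simplex.

```python
from collections import Counter
from itertools import combinations


def boundary_edge_report(tetrahedra):
    """Count, for each 3-simplex and each 2-simplex, its edges on the boundary.

    tetrahedra: iterable of 4-tuples of vertex labels (hypothetical encoding).
    """
    tets = [tuple(sorted(t)) for t in tetrahedra]
    face_count = Counter(f for t in tets for f in combinations(t, 3))
    # A 2-simplex is on the boundary when exactly one 3-simplex contains it.
    boundary_faces = {f for f, c in face_count.items() if c == 1}
    boundary_edges = {e for f in boundary_faces for e in combinations(f, 2)}
    boundary_vertices = {v for f in boundary_faces for v in f}
    all_vertices = {v for t in tets for v in t}
    return {
        "all_vertices_on_boundary": all_vertices == boundary_vertices,
        "per_tetrahedron": [
            sum(e in boundary_edges for e in combinations(t, 2)) for t in tets
        ],
        "per_face": {
            f: sum(e in boundary_edges for e in combinations(f, 2))
            for f in face_count
        },
    }


# Two tetrahedra glued along the face (0, 1, 2).
report = boundary_edge_report([(0, 1, 2, 3), (0, 1, 2, 4)])
print(report["per_face"][(0, 1, 2)])
```

In this small example every edge lies on the boundary, so each 2-simplex, including the interior one, has three boundary edges.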

4. A FAKE CUBE IS REAL IF IT HAS A "NICE" TRIANGULATION

A natural way of attempting to prove the Poincaré conjecture is
to try to show that a fake cube has a triangulation that can be
shelled. We say that a triangulation T of an n-cell can be shelled if
the n-simplexes of T can be ordered s_1, s_2, . . . , s_n so that for
each integer k < n, s_k + s_{k+1} + · · · + s_n is an n-cell. A
triangulation of a fake cube can be shelled if the 3-simplexes of the
triangulation can be ordered s_1, s_2, . . . , s_n such that for each
k < n, s_k + s_{k+1} + · · · + s_n is a fake cube.

Theorem 3. Suppose T is a triangulation of a 2-cell D and D' is a
disk in D which is the sum of elements of T. Then the 2-simplexes
of T not contained in D' can be ordered s_1, s_2, . . . , s_n so that
for each integer k < n, D' + s_k + s_{k+1} + · · · + s_n is a disk.

proof. The theorem is proved by induction on n. Let s be a
2-simplex of T which does not lie in D' but which has an edge on
Bd D. We can either let s be s_1 and shell it or else some edge s' of s
not on D' is a spanning arc of D. For the latter we consider the two
disks into which s' separates D and expand D' to be one of them.
We have thereby reduced n and our inductive process is started
(see Figure 10).
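
For a triangulated 2-cell, whether a proposed ordering is a shelling can be tested combinatorially. The following is a minimal sketch under stated assumptions: triangles are encoded as 3-tuples of vertex labels, the names `is_arc` and `is_shelling_order` are hypothetical, and the test used is the sufficient condition that each triangle meets the sum of the later ones in an arc (one edge, or two edges with a common vertex). Working back from the last triangle, attaching a triangle to a disk along an arc gives a disk.

```python
from itertools import combinations


def faces(triangle):
    """Vertices and edges of a triangle, as frozensets of vertex labels."""
    return {frozenset(c) for k in (1, 2) for c in combinations(triangle, k)}


def is_arc(common):
    """True if the shared faces are one edge, or two edges, and nothing more."""
    edges = [f for f in common if len(f) == 2]
    vertices = {f for f in common if len(f) == 1}
    covered = {frozenset([v]) for e in edges for v in e}
    return len(edges) in (1, 2) and vertices == covered


def is_shelling_order(triangles):
    """Check that each triangle meets the sum of the later ones in an arc."""
    for k in range(len(triangles) - 1):
        rest = set().union(*(faces(t) for t in triangles[k + 1:]))
        if not is_arc(faces(triangles[k]) & rest):
            return False
    return True


strip = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
print(is_shelling_order(strip))
print(is_shelling_order([strip[1], strip[0], strip[2]]))
```

Removing the middle triangle of the strip first leaves two triangles that meet in a single vertex, so the second ordering is rejected.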

Theorem 4. A fake cube is real if it has a triangulation that can
be shelled.

proof. It can be shown inductively that each of s_n, (s_{n-1} + s_n),
(s_{n-2} + s_{n-1} + s_n), . . . , (s_1 + s_2 + · · · + s_n) is a real
cube.

In contrast with the ease with which any triangulation of a
2-cell can be shelled, there is difficulty in shelling a cube. It
sometimes comes as a shock to the beginner that even real cubes have
triangulations which cannot be shelled. Mary E. Rudin described
[23] a rectilinear triangulation of a real tetrahedron that could not
be shelled. Earlier unshellable triangulations of topological cubes
were given in [9] and [27]. We describe two triangulations of a
topological cube that cannot be shelled.

Example 1 . Let C be the house with two rooms shown in Figure 
11. The walls are made out of one layer of brick (without mortar) 
so that each brick touches air in two components, and if two bricks 
intersect, the intersection is a face, edge, or corner of each. The 
object is a house with two floors so that to enter the lower floor, one 
goes through a tunnel from the roof and to get to the upper floor one 
goes through a tunnel from below. There is a partition in each 
room from the wall to the tunnel leading to the other room. This 
causes the room to be simply connected. 

We see that C is a topological cube if we regard it as a real cube 
with indentations from above and below. 

Suppose C is triangulated as follows. Order the corners of the
bricks as follows. First put in any vertex which is a component of
the intersection of a brick with Bd C. There is one of these on each
inside corner of a room. Next, list the vertices not of this sort but
which lie on an edge of a brick that is a component of the
intersection of the brick with Bd C. Finally list the other vertices.
The bricks are now triangulated as follows. A face of a brick is
divided with a diagonal that contains the first vertex on the face.
Once the faces of a brick are triangulated, the brick is triangulated
by taking the cone from the first vertex of the brick to the triangles
on the boundary of the brick that do not contain this vertex.

Note that each tetrahedron of the triangulation intersects the
boundary of the cube in a disconnected set. Hence the triangulation
cannot be shelled. It may be noted that each vertex of the
triangulation lies on Bd C.



Figure 11 (label: tunnel to bottom floor)

Figure 12


Example 2. Consider the cube with the plugged knotted hole
shown in Figure 12. The object is topologically a cube since the
hole is plugged at the upper end with a small cube C'. The resulting
topological cube C is triangulated so that the edges of C' are
1-simplexes of the triangulation. This may be accomplished by
first subdividing C into small cubes of the same size as C',
subdividing each cube into two prisms, and then subdividing each prism
into three tetrahedra, taking care that the resulting tetrahedra fit
together right.

Consider a simple closed curve J made up of a spanning segment
of C which is an edge of C' and an arc of Bd C. Then J is knotted.
If we were to start shelling the triangulation of C, at each stage
there would be a knotted simple closed curve which lies, except for
one spanning simplex, on the boundary of the resulting cube. It
follows that the triangulation of C cannot be shelled, for at the last
stage there could be no such simple closed curve.

Closely connected with the notion of shelling is that of sequential
connectedness. A triangulation of a finite n-complex all of whose
simplexes are faces of an n-simplex is sequentially connected if its
n-simplexes can be ordered s_1, s_2, . . . , s_n so that for each
i < n, s_i intersects (s_{i+1} + s_{i+2} + · · · + s_n) in a connected
set. The notion of sequential connectedness is stronger than either
that of connectedness or unicoherence since a projective plane is
connected and unicoherent. It has no sequentially connected
triangulation or else it would be simply connected.

The following theorem comes from [2]. 

Theorem 5 . A fake cube is real if it has a sequentially connected 
triangulation. 

A still more general sort of shelling is frequently used where
simplexes of all dimensions are considered. A complex C is
collapsible if it has a decreasing sequence of subcomplexes C = C_1,
C_2, . . . , C_n such that C_n has only one vertex and for each i < n,
C_{i+1} is obtained from C_i by eliminating from C_i exactly two
simplexes such that one is a major face of the other. If s, s' are the
simplexes eliminated from C_i, with s' a face of s, it follows that s
is not a face of any other simplex of C_i, and s' is not a face of any
simplex except s and s' itself.
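
The elimination step in this definition is easy to carry out by machine on a small complex. The sketch below is only an illustration: the names `closure` and `greedy_collapse` are hypothetical, simplexes are encoded as frozensets of vertex labels, and the search is greedy, so when it stops short of a single vertex that does not by itself prove that no collapsing sequence exists (the greedy choice can matter, especially in higher dimensions).

```python
from itertools import combinations


def closure(facets):
    """All nonempty faces of the given simplexes, as frozensets of labels."""
    faces = set()
    for facet in facets:
        for k in range(1, len(facet) + 1):
            faces.update(frozenset(c) for c in combinations(facet, k))
    return faces


def greedy_collapse(facets):
    """Repeatedly eliminate a pair (s', s) with s' a major face of s only."""
    complex_ = closure(facets)
    while True:
        move = None
        for face in complex_:
            cofaces = [s for s in complex_ if face < s]
            # s' qualifies when its only proper coface is s, one dimension up.
            if len(cofaces) == 1 and len(cofaces[0]) == len(face) + 1:
                move = (face, cofaces[0])
                break
        if move is None:
            return complex_
        complex_ -= set(move)


print(len(greedy_collapse([(0, 1, 2)])))            # a 2-simplex
print(len(greedy_collapse([(0, 1), (1, 2), (0, 2)])))  # a circle
```

A single 2-simplex reduces to one vertex, while the boundary of a triangle admits no elimination at all and is returned unchanged.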

Theorem 6 . A fake cube is real if it has a collapsible triangulation. 

proof. The proof is found in works of J. H. C. Whitehead [32],
but for completeness we indicate a proof in the following paragraph.

For each simplex s of the triangulation, let S(s) be the star of the
barycenter of s under the second barycentric subdivision of the
collapsible triangulation. Then these S(s)'s provide the fake cube
with a cellular subdivision. By following the way that the original
triangulation can be collapsed, we find that this cellular subdivision
can be shelled. Then the proof follows that of Theorem 3.

It is not immediately obvious whether or not the triangulation 
given by Mary E. Rudin in [23] or that of a house with two rooms or 
that of a cube with a plugged knotted hole is either sequentially 
connected or collapsible. We give an example that is not. 

Example 3. This example is the same as Example 2 except that
the knot is more complicated and consists of a double trefoil knot
instead of a single one (see Figure 13). The proof that the resulting
triangulation is not collapsible is given elsewhere. By getting more
complicated holes we can show that for each integer n there is a
triangulation of a cube whose nth barycentric subdivision is not
collapsible. We now consider another condition on the triangulation
of a fake cube that will ensure that it is a real cube.

Figure 13

Theorem 7. A fake cube K is a cube if and only if it has a
triangulation T such that each vertex of T lies on Bd K and each
3-simplex of T has at least two edges on Bd K.

proof. The necessity of the condition follows from the fact that
a cube has a triangulation with only one 3-simplex. We now show
that a fake cube is real if it has a triangulation satisfying the
prescribed condition.

Following Theorem 2, we suppose with no loss of generality that
each 2-simplex of T has precisely one or three edges on Bd K.

We show here that each 1-simplex of T lies on Bd K. Assume
T has 1-simplexes s_1^1, s_2^1, . . . , s_m^1 not on Bd K. The
fundamental group of Bd K + Σ s_i^1 is the free group on m generators

G = {a_1, a_2, . . . , a_m}

Adding a 2-simplex s^2 of T with three edges on Bd K to
Bd K + Σ s_i^1 does not change the fundamental group of the resulting
set since Bd s^2 could already be shrunk to a point on Bd K. However,
adding a 2-simplex s^2 of T with one edge on Bd K to Bd K + Σ s_i^1
results in a set whose fundamental group is a group with m generators
and one relation w = 1 where w is a two letter word (since s^2
has precisely two edges among the s_i^1's). Accordingly, the
fundamental group of the 2-skeleton of T is

G' = {a_1, a_2, . . . , a_m | w_1 = w_2 = · · · = w_n = 1}

where each of the w_i's is a two letter word. This group is
nontrivial because the homomorphism f(a_i) = b sends G' onto the
group H = {b | b^2 = 1}. This contradicts the fact that K is simply
connected. This contradiction arises from assuming that T has a
1-simplex not on Bd K.
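
The reason the assignment a_i -> b respects every relation is a parity count, which the following sketch makes explicit. The encoding of a word as a list of (generator index, exponent) pairs and the function name `maps_to_identity_in_z2` are assumptions made for illustration; they are not notation from the text.

```python
from itertools import product


def maps_to_identity_in_z2(word):
    """Image of a word under a_i -> b in the group {b | b^2 = 1}.

    word: list of (generator_index, exponent) pairs with exponent +1 or -1.
    The image is b raised to the sum of the exponents, and b^k is the
    identity exactly when k is even.
    """
    return sum(exponent for _, exponent in word) % 2 == 0


# Every two letter word, whatever its generators and signs, is sent to 1.
two_letter_words = [
    [(i, e1), (j, e2)]
    for i, j in product(range(3), repeat=2)
    for e1, e2 in product((1, -1), repeat=2)
]
print(all(maps_to_identity_in_z2(w) for w in two_letter_words))
```

Since each relator has even length, the assignment extends to a homomorphism of G' onto H, and a single generator is sent to the nontrivial element b.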

Since each 1-simplex of T lies on Bd K, each 2-simplex of T either 
lies on Bd K or separates K. Hence T can be shelled. Theorem 7 
then follows from Theorem 4. 

5. EULER CHARACTERISTICS? 

For a triangulation of a complex, let V_i denote the number of
i-simplexes in the triangulation. For triangulations of a compact
connected orientable 2-manifold, the number

V_2 - V_1 + V_0

is important since it determines the topology of the 2-manifold.
We might hope we could tell whether or not a fake cube is real by
considering a triangulation and computing the V_i's. We find the
following, however.

Theorem 8. For each triangulation T of a compact 3-manifold,

V_2 = 2V_3.

proof. We merely note that each 3-simplex of T has four faces
and each face belongs to precisely two 3-simplexes.

Theorem 8 shows us that we do not need to concern ourselves
with the number of 3-simplexes. This might seem temporarily
encouraging, for whether or not a triangulated homotopy sphere is
topologically S^3 depends not on the number of 3-simplexes, but
rather on whether or not the 2-skeleton of the triangulation can be
embedded in E^3. We then find the following discouraging result,
however.

Theorem 9. For each triangulation T of a compact 3-manifold,
V_1 = V_3 + V_0.

proof. For each vertex v of T, let V_i(v) be the number of
i-simplexes of T containing v. It follows from Euler's formula applied
to the 2-sphere L(v, T) = Bd Star (v, T) that

V_3(v) - V_2(v) + V_1(v) = 2

Summing with respect to all the vertices of T, we find that

Σ [V_3(v) - V_2(v) + V_1(v)] = 2V_0

Each i-simplex contributes to i + 1 of the V_i(v)'s so

Σ V_i(v) = (i + 1)V_i

Hence,

4V_3 - 3V_2 + 2V_1 = 2V_0

Theorem 9 follows by substituting the equation from Theorem 8
in the foregoing equation.
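
The two counting relations can be confirmed on the smallest triangulation of S^3, the boundary of a 4-simplex. This is a sketch for illustration only: the function name `f_vector` and the encoding of 3-simplexes as 4-tuples of vertex labels are assumptions, and the manifold is taken to be without boundary, as the proof of Theorem 8 requires.

```python
from itertools import combinations


def f_vector(tetrahedra):
    """Return [V_0, V_1, V_2, V_3] for a list of 3-simplexes."""
    counts = []
    for k in range(1, 5):
        faces = {c for t in tetrahedra for c in combinations(sorted(t), k)}
        counts.append(len(faces))
    return counts


# Boundary of a 4-simplex on the vertices 0..4: five tetrahedra.
sphere = list(combinations(range(5), 4))
V0, V1, V2, V3 = f_vector(sphere)
print(V0, V1, V2, V3)
print(V2 == 2 * V3, V1 == V3 + V0, V3 - V2 + V1 - V0 == 0)
```

The counts are 5, 10, 10, 5, so both relations hold and the alternating sum V_3 - V_2 + V_1 - V_0 vanishes.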

As a final discouraging factor, we have the following theorem. 

Theorem 10. If M_1, M_2 are two (possibly topologically different)
compact 3-manifolds, then there are triangulations T_1, T_2 of M_1,
M_2 respectively such that T_1, T_2 have precisely the same number of
simplexes in each dimension.

proof. I had a rather cumbersome proof of this result for
orientable manifolds using an interesting theorem by Wallace [28] and
Lickorish [10], but Leslie Glaser pointed out the following easy
proof.

Let T_1', T_2' be arbitrary triangulations of M_1, M_2. By
subdividing one of them we get triangulations with the same number of
vertices. If the triangulations also have the same number of
3-simplexes, Theorems 8 and 9 imply that they have the same
number of simplexes of each dimension.

For the one of the triangulations with the larger number of
3-simplexes, subdivide the triangulation by adding a vertex at the
barycenter of a 3-simplex. This increases the number of vertices
by one and the number of 3-simplexes by three. To the other, add
a vertex at the barycenter of a 2-simplex. This increases the number
of vertices by one and the number of 3-simplexes by four. Hence,
in a finite number of such subdivisions, we bring T_1 and T_2 into
balance with respect to the number of simplexes in each dimension
(see Figure 14).
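
The balancing step is pure bookkeeping on the pair (V_0, V_3), and can be traced with a few lines of code. In this sketch the function name `balance` and its arguments are hypothetical; it assumes the two triangulations already have the same number of vertices and tracks only the counts, not the triangulations themselves.

```python
def balance(v0, v3_a, v3_b):
    """Equalize the numbers of 3-simplexes of two triangulations.

    Each round adds one vertex to both: at the barycenter of a 3-simplex in
    the triangulation with more 3-simplexes (V_3 grows by 3) and at the
    barycenter of a 2-simplex in the other (V_3 grows by 4).
    Returns (rounds, final V_0, final V_3).
    """
    rounds = 0
    while v3_a != v3_b:
        if v3_a > v3_b:
            v3_a, v3_b = v3_a + 3, v3_b + 4
        else:
            v3_a, v3_b = v3_a + 4, v3_b + 3
        v0 += 1
        rounds += 1
    return rounds, v0, v3_a


print(balance(5, 5, 8))
```

Each round closes the gap in V_3 by exactly one, so the number of rounds equals the initial difference; here three rounds give 8 vertices and 17 3-simplexes on both sides.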

This result suggests that knowing the V_i's does not give enough
information. Although T_1 and T_2 have the same number of
simplexes in each dimension, they may differ in how these simplexes
fit together. Something might be accomplished by putting
coefficients on the simplexes and weighting them according to how other
simplexes hang on to them, but this would be more difficult than
counting.

Theorem 11. For each triangulation of a compact 3-manifold,
V_3 - V_2 + V_1 - V_0 = 0.

proof. The result is an immediate consequence of Theorems 8
and 9.



Figure 14 (left panel: increase V_3 by 3; right panel: increase V_3 by 4)


6. CURVES ENCIRCLED BY TORI 

An affirmative answer to the following problems would provide a
partial solution to the Poincaré conjecture.

Question 1. Is a compact connected 3-manifold M topologically 
S 3 if each tame simple closed curve in M can be isotopically shrunk 
to small size in M? 

A simple closed curve J in M can be isotopically shrunk to small
size in M if for each ε > 0 there is an isotopy h: J × [0, 1] → M such
that h_0 = I (identity), and diameter h_1(J) < ε.

Theorem 19 of Section 7 would provide an affirmative answer to 
Question 1 if we have an affirmative answer to the following 
question. 

Question 2. Can each simple closed curve in a homotopy 3-mani¬ 
fold be encircled by a solid torus? 

A solid torus is an object topologically equivalent to the cartesian
product of a circle S^1 with a disk D. If p_0 ∈ Int D, the image of
S^1 × p_0 is called a core of the solid torus. A solid torus T encircles
a simple closed curve J if J ⊂ Int T and J is homotopic in Int T to
a core of T. In fact, for a map f: J → Int T we say that T encircles
f(J) if there is a homeomorphism h: J → Int T such that T encircles
h(J) and f is homotopic to h in Int T.

Not every map of a simple closed curve into E^3 can be encircled
by a solid torus but it can if the map is a homeomorphism.

Theorem 12. Each simple closed curve in E^3 is encircled by a
solid torus.

proof. Let a, b be two points of a simple closed curve J and P
be a plane separating a from b. Let axb be one arc from a to b in J
and ayb be the other. There is a polygonal simple closed curve K in
P separating axb · P from ayb · P in P. Let T be a tubular
neighborhood of K so small that it misses J. If the point at infinity
is added to E^3, the complement of T is a solid torus which encircles J.

To get a solid torus encircling J without adding the point at
infinity, we could consider a large round ball B whose interior
contains K + J. Let A be a polygonal arc in P - J from K to Bd B.
Then the solid torus is B minus a tubular neighborhood of A + K.



Each polygonal simple closed curve J is encircled by a solid torus
each point of which lies near J. We should not conclude that the
torus promised by Theorem 12 lies close to the simple closed curve
it encircles because the simple closed curve was not assumed to be
polyhedral. Neither should we conclude that Theorem 12 holds in
3-manifolds other than E^3 and S^3.

Example 4. There is a simple closed curve J in E^3 such that each
solid torus encircling it extends far away. Consider a wild 2-sphere
S in E^3 with only a 0-dimensional set of wild points such that the
complement of S is not simply connected (see Figure 15). Let J be
a simple closed curve on S which contains all the wild points of S,
and K be a polygonal simple closed curve in E^3 - S which cannot
be shrunk to a point in E^3 - S. Then K cannot be shrunk to a
point in E^3 - J even though J bounds a disk in S ⊂ E^3 - K. We
indicate in the following paragraph why each solid torus encircling
J must intersect K.

If T were a solid torus in E^3 - K encircling J, it would have a
tame core C. This tame core C could be shrunk to a point in
E^3 - K so C would not homologically link K. Then K would not
homologically link C. It follows from Theorem 15 that C is
unknotted so K could be shrunk to a point in E^3 - C. If K could
be shrunk to a point in E^3 - C, however, it could be shrunk in
E^3 - Int T and hence in E^3 - J.

Suppose simple closed curve J is oriented as is the longitudinal
direction around a solid torus T. In the next two theorems, for
each map f of J into T, we let L(f) be the number of times f circles T
longitudinally. To find L(f) we could let C be a core of T with
length oriented as T and r be a retraction of T onto C. Then
subdivide J into n small positively oriented arcs x_1x_2, x_2x_3,
. . . , x_nx_1 and let L(f) be the sum of the oriented distances along
C from rf(x_i) to rf(x_{i+1}) divided by the length of C, where it is
understood that x_{n+1} denotes x_1.
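
The recipe for L(f) is a discrete winding-number computation. The sketch below is an illustration under stated assumptions: the core C is parametrized by an angle so that its length is 2π, the input is the list of angles of rf(x_1), . . . , rf(x_n), the arcs are small enough that consecutive images are less than half of C apart, and the name `longitudinal_number` is hypothetical.

```python
import math


def longitudinal_number(angles):
    """Sum the oriented distances along C between consecutive images.

    angles: positions (radians) of r f(x_1), ..., r f(x_n) on the core C.
    Each oriented distance is taken as the representative of the angle
    difference lying in [-pi, pi), which is valid for small arcs.
    """
    n = len(angles)
    total = 0.0
    for i in range(n):
        step = angles[(i + 1) % n] - angles[i]
        step = (step + math.pi) % (2 * math.pi) - math.pi
        total += step
    return round(total / (2 * math.pi))


# Twelve sample points on a curve that runs twice around the torus.
twice = [k * math.pi / 3 for k in range(12)]
print(longitudinal_number(twice))
print(longitudinal_number(twice[::-1]))
```

Reversing the direction of travel changes the sign of the result, as expected for an oriented count.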

Theorem 13. If f_1, f_2 are maps of J into Bd T such that
L(f_1) > 2, L(f_2) ≠ 0, then either L(f_2) is a multiple of L(f_1) or
else there is a map f of J into f_2(J) + f_1(J) such that

0 < L(f) < L(f_1)

proof. We suppose L(f_2) > 0. Assume the theorem is false and
that L(f_2) is minimal in the sense that if f_1, f_3 are two other
maps showing the theorem false, then L(f_2) ≤ L(f_3). Note that
L(f_1) < L(f_2).

First we consider the case where f_1(J), f_2(J) have a point p in
common. Let f_3 be a map of J onto f_1(J) + f_2(J) so that as a point
moves half way around J, its image under f_3 starts at p and traces
out f_2(J), and as the point goes the other half way around J, the
image under f_3 traces out f_1(J) in the negative direction so that

L(f_3) = L(f_2) - L(f_1)

But L(f_3) is not a multiple of L(f_1) unless L(f_2) is, and
L(f_3) < L(f_2). This contradicts the minimality we had assumed for
L(f_2).

Finally we consider the case where f_1(J) · f_2(J) = 0. Let
f_i' (i = 1, 2) be a homeomorphism of J into f_i(J) such that
L(f_i') > 0. Since f_1'(J) · f_2'(J) = 0,

L(f_1') = L(f_2')

Since f_1(J) · f_2'(J) = 0, L(f_1) is a multiple of L(f_2') and hence a
multiple of L(f_1'). Then L(f_1') ≤ L(f_1), and under the assumption
that there is no f with 0 < L(f) < L(f_1) we have

L(f_1) = L(f_1')

Since f_2(J) · f_1'(J) = 0, L(f_2) is a multiple of L(f_1') and hence a
multiple of L(f_1).
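
The first case of the proof is a step of the subtractive Euclidean algorithm applied to the longitudinal numbers. The following sketch records only that arithmetic, not the construction of the maps; the function name `reduce_longitudes` is hypothetical, and both inputs are assumed to be positive integers attached to curves that share a point.

```python
def reduce_longitudes(l1, l2):
    """Repeat the step L(f_3) = L(f_2) - L(f_1) until the numbers agree.

    Returns the common final value (the greatest common divisor of the
    inputs) together with the list of intermediate pairs.
    """
    history = [(l1, l2)]
    while l1 != l2:
        if l1 < l2:
            l2 -= l1
        else:
            l1 -= l2
        history.append((l1, l2))
    return l1, history


print(reduce_longitudes(2, 3))
```

Starting from curves that circle T twice and three times, two subtractions produce a longitudinal number of 1, which lies strictly between 0 and 2 as the theorem asserts.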

A punctured disk is a set homeomorphic to D_0 - (Int D_1 +
Int D_2 + · · · + Int D_n) where D_0 is a disk and D_1, D_2, . . . , D_n
is a collection of mutually exclusive disks each lying in Int D_0.

Theorem 14. Suppose f is a map of a punctured disk D_0 -
(Int D_1 + Int D_2 + · · · + Int D_n) into a solid torus T such that
L(f|Bd D_0) = 1 and f(Bd D_1 + Bd D_2 + · · · + Bd D_n) ⊂ Bd T.
Then each open subset of Bd T containing f(Bd D_1 + Bd D_2 +
· · · + Bd D_n) also contains a simple closed curve isotopic in T to a
core of T.

proof. Suppose each Bd D_i (i = 1, 2, . . . , n) is oriented
opposite that of Bd D_0. By joining the Bd D_i's with cross cuts in
the punctured disk, we have

L(f|Bd D_0) = L(f|Bd D_1) + L(f|Bd D_2) + · · · + L(f|Bd D_n) = 1

Let n_0 be a positive integer minimal with respect to there being a
map g of S^1 into f(Bd D_1 + Bd D_2 + · · · + Bd D_n) such that
L(g) = n_0. Either n_0 = 1 or else by the foregoing relation there is
an L(f|Bd D_i) which is not a multiple of n_0. Theorem 13 shows
that this second alternative leads to the contradiction that n_0 is not
minimal. Hence n_0 = 1.

Let U be an open subset of Bd T containing g(S^1) and such that
each boundary component of U is a simple closed curve. We show
that U contains a simple closed curve that is isotopic in T to a core
of T.




If each boundary component of U bounds a disk in T, Theorem 14
follows since we can get a simple closed curve C in Bd T which
circles T longitudinally once and then push this C off the disks of
Bd T - U. If J is a boundary component of U which does not
bound a disk, J must circle T longitudinally, for if it circles it
meridionally but not longitudinally, g(S^1) could not circle
longitudinally, since g(S^1) misses J. In fact, L(g) is a multiple of
the number of times J circles longitudinally. Since L(g) = 1, J circles
T once longitudinally. A simple closed curve J' in U runs "parallel"
to J and circles T once longitudinally also. Hence J' is isotopic in
T to a core of T.

It may be noted that we could not conclude that there is a simple
closed curve in f(Bd D_1 + Bd D_2 + · · · + Bd D_n) isotopic to a
core of T because it might be that n = 2, f(Bd D_1) is a simple
closed curve circling T twice and f(Bd D_2) is a simple closed curve
circling T three times with f(Bd D_1) · f(Bd D_2) being one point.

Theorem 15. If D is a disk in E^3 and T is a solid torus in E^3
encircling Bd D, each core of T is unknotted.

proof. With no loss of generality, we suppose Bd T is
polyhedral. To see that there is no loss of generality in supposing
this we argue as follows. Suppose T is the image of a polyhedral torus
T' under a homeomorphism h. It follows from [6] or [15] that we
can adjust h slightly to get a homeomorphism h' of T' onto T such
that h' is locally piecewise linear mod Bd T'. Reduce T' slightly to
a polyhedral torus T'' so that T'' has the same core as T',
T'' ⊂ Int T', and Bd D ⊂ h'(Int T''). Then h'(T'') is a polyhedral
torus with the same core as T.

We suppose that D is locally polyhedral mod Bd D, and from [4]
we can support this. We suppose furthermore that D · Bd T is the
sum of a finite number of mutually exclusive simple closed curves.
Such a desirable condition results from a slight adjustment of either
the vertices of Bd T or of certain vertices on Bd D.

If a component of D · Bd T bounds a disk on Bd T, we take the
inside of such a disk, replace a disk in D with it, and shove it to one
side of T so as to eliminate a component of D · Bd T. For
convenience we call the new disk D also. This elimination of
components of D · Bd T is continued until no component of D · Bd T
bounds a disk on Bd T.

It follows from Theorem 14 that each simple closed curve in
D · Bd T circles T longitudinally exactly once. Let D' be a subdisk
of D such that D' · Bd T = Bd D'. A core C of T is unknotted
since it can be homotopically pulled in its own complement over to
Bd D' and can then be shrunk to a point in D'. A tame simple
closed curve is unknotted if it can be shrunk to a point in its own
complement.

7. NICE SHRINKING 

If J is a simple closed curve in E^3, then J can be shrunk to a point
so that at each stage of the shrinking, except the last, J has been
shrunk to a simple closed curve. A natural way to get this
shrinking is to pick a point p of E^3 and decide that at time t the
point x of J has been moved to the point of the segment xp that
divides it in the ratio t to (1 - t). Note that if J is a polygonal
simple closed curve, then at each stage of the homotopy except the
last, J is a polygonal simple closed curve. It is not known whether or
not this property holds in an arbitrary simply connected compact
3-manifold, but Theorem 17 shows that unless the Poincaré conjecture
is true, it does not.
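
The straight-line shrinking toward p is the map x -> (1 - t)x + tp. The sketch below applies it to the vertices of a polygonal curve; the function name `shrink` and the sample triangle are assumptions made for illustration.

```python
def shrink(points, p, t):
    """Move each point x to the point of segment xp dividing it as t : (1 - t)."""
    return [
        tuple((1 - t) * xi + t * pi for xi, pi in zip(x, p))
        for x in points
    ]


# Vertices of a polygonal simple closed curve (a triangle) and a point p.
curve = [(0, 0, 0), (2, 0, 0), (0, 2, 0)]
p = (0, 0, 2)
print(shrink(curve, p, 0))
print(shrink(curve, p, 0.5))
print(shrink(curve, p, 1))
```

For t < 1 the map is a similarity about p, so a polygonal simple closed curve stays a polygonal simple closed curve; only at t = 1 does every vertex land on p.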

In this section we lean heavily on the following result proved in 
[5]. McMillan [13] has an even stronger version of this result. 

Theorem 16. A compact, connected 3-manifold M is topologically 
S 3 if each tame simple closed curve in M lies in a topological cube 
in M. 

proof. Although we do not repeat the proof given in detail in 
[5], we list the steps used in proving the theorem. 

1. Let T be a triangulation of M. We suppose each polygonal 
simple closed curve in this triangulation lies in a topological cube 
in M . 

2. Show that step 1 implies that each polygonal simple closed 
curve lies on the interior of a polyhedral cube. 

3. Show that step 2 implies that the 1-skeleton of T lies on the 
interior of a polyhedral cube. 

4. Show that step 3 implies that the 2-skeleton of T lies in a 
polyhedral punctured cube. 

5. Fill in the holes of the punctured cube of step 4 to show that M 
is topologically S 3 . 






The following result is new and its proof is included. 

Theorem 17. A compact connected 3-manifold M is topologically
S^3 if for each tame simple closed curve J in M there is a homotopy
f_t: J → M (0 ≤ t ≤ 1) such that f_0 is the identity, each f_t(J) is
encircled by a solid torus in M, and f_1(J) lies on the interior of a
topological cube in M.

proof. We prove Theorem 17 by showing that each simple
closed curve in M lies on the interior of a topological cube in M.
Then Theorem 17 follows from Theorem 16.

Let X be the set of all points t of [0, 1] such that f_t(J) lies on the
interior of a cube in M. The proof is finished if we show that
0 ∈ X. It follows easily that X is an open subset of [0, 1] that
contains 1. It follows that 0 ∈ X when we show that X is closed.
Toward this end we let t_0 be a point of the closure of X and show
that t_0 ∈ X.

Let T_0 be a solid torus in M encircling f_{t_0}(J). We select such a
T_0 whose boundary is polyhedral. That we can do this follows
from an argument used in the proof of Theorem 15.

If there is a cube C in M whose interior contains a simple closed
curve on Bd T_0 that circles T_0 once longitudinally, there is a cube C'
whose interior contains a core K of T_0. Expanding a small tubular
neighborhood of K to contain f_{t_0}(J), we see how to adjust C' so that
its interior contains f_{t_0}(J). Hence we finish the proof of Theorem 17
by showing that there is a cube C in M such that Int C contains a
simple closed curve on Bd T_0 that circles T_0 once longitudinally.

Let t_1 be a point of X so near t_0 that T_0 encircles f_{t_1}(J) and C
be a topological cube in M whose interior contains f_{t_1}(J). Let D be
a polyhedral disk and h: J → Bd D be a homeomorphism. Let
g: D → Int C be a map such that g = f_{t_1}h^{-1} on Bd D and g is
piecewise linear in a neighborhood of g^{-1}[Bd T_0 · g(D)]. We
suppose that g is in a general position with respect to Bd T_0 in the
sense that g^{-1}[Bd T_0 · g(D)] is the sum of a finite collection of
mutually exclusive simple closed curves. Let E be the component of
D - g^{-1}[Bd T_0 · g(D)] containing Bd D. Then E is a punctured disk,
and it follows from Theorem 14 that there is a simple closed curve
in Bd T_0 · Int C that is isotopic in T_0 to a core of T_0.

We have the following as a corollary to Theorem 17. 

Theorem 18. A compact connected 3-manifold M is topologically
S^3 if each tame simple closed curve J in M can be isotopically shrunk
to small size in M so that the image of J at each stage of the isotopy
is a tame simple closed curve.

Although it is not known that each tame simple closed curve J in a
homotopy 3-sphere can be shrunk to a point so that at each
intermediate stage the image is a simple closed curve, it is known that J
can be shrunk so that at each intermediate stage the image of J is
either a simple closed curve or a figure eight.

Theorem 19. If J is a polygonal simple closed curve which can be
shrunk to a point in a triangulated 3-manifold M, there is a
homotopy $f_t: J \to M$ such that $f_0$ is the identity, $f_1$ is a constant map, and
each $f_t$ ($0 < t < 1$) is a piecewise linear map which is a
homeomorphism except at a finite number of t's and has precisely two
singularities at each of these.

Proof. Let D be a 2-simplex and $g: D \to M$ be a piecewise linear
map such that $g: \mathrm{Bd}\, D \to J$ is a homeomorphism and g is a general
position map in the sense that there is a triangulation T of D such
that g is a piecewise linear homeomorphism on each simplex of T;
and if $s'$, $s''$ are two simplexes of T,

dimension $[g(s') \cdot g(s'') - g(s' \cdot s'')] \le$ dimension $s'$ + dimension $s''$ - 3

Since T can be shelled (Theorem 3), there is an ordering $s_1, s_2,
\ldots, s_n$ of the 2-simplexes of T such that for each $i \le n$,
$s_i + s_{i+1} + \cdots + s_n$ is a disk $D_i$. Note that g is a homeomorphism on
each Bd $D_i$ and $g^{-1}$ is locally a homeomorphism modulo a finite set
of points $X_i$ on $g(s_i + \mathrm{Bd}\, D_{i+1})$ where each point of $X_i$ has
precisely two preimages under g.

The homotopy $f_t: J \to g(D)$ is described as follows.

$f_0 = I$

$f_{i/n}(J) = g(\mathrm{Bd}\, D_{i+1})$

$f_t = f_{i/n}$ on $f_{i/n}^{-1} g(\mathrm{Bd}\, D_{i+1} \cdot \mathrm{Bd}\, D_{i+2})$

$f_t = g h_t g^{-1} f_{i/n}$ on $f_{i/n}^{-1} g(\mathrm{Bd}\, D_{i+1} - \mathrm{Bd}\, D_{i+2})$

for $i/n \le t \le (i + 1)/n$, where $h_t$ is a piecewise linear isotopy that
shoves $\mathrm{Bd}\, D_{i+1} - \mathrm{Bd}\, D_{i+2}$ straight to $\mathrm{Bd}\, D_{i+2} - \mathrm{Bd}\, D_{i+1}$ through
$s_{i+1}$ in such a way that for no t does $g h_t(\mathrm{Bd}\, D_{i+1} - \mathrm{Bd}\, D_{i+2})$
contain more than one point of $X_{i+1}$ and for each point x of $X_{i+1}$
there is exactly one $h_t$ with $x \in g h_t(\mathrm{Bd}\, D_{i+1} - \mathrm{Bd}\, D_{i+2})$. For the
values of t for which $g h_t(\mathrm{Bd}\, D_{i+1} - \mathrm{Bd}\, D_{i+2})$ misses $X_{i+1}$, $f_t$ is a
homeomorphism and for the finite number of others, $f_t(J)$ is a
figure eight.


8. WHY? 

Why are mathematicians interested in finding a solution to the
Poincaré conjecture? For one thing, a solution would show the
power of the algebraic technique. For compact 3-manifolds, it
would show at least in one case that the algebraic properties
determine the topology. It would also provide us with a handy
characterization of $S^3$.

Aside from this, there is a historical interest in the Poincaré
conjecture. Poincaré announced in [21] that two compact
combinatorial n-manifolds were topologically equivalent if they had the
same homological characters, but he disproved this in [22] by
constructing a 3-manifold which is not simply connected, even though
all its 0, 1, and 2 cycles bounded. The homology spheres given in
Section 2 were such examples. Alexander showed [1] that there
were combinatorially different toroidal 3-manifolds which had the
same homotopy properties. It follows as a result of Moise’s proof
of the Hauptvermutung [16] that these homotopically equivalent
toroidal manifolds were topologically different. We might still
hope, however, that if the first homotopy group is trivial, this would
imply that the manifolds were topologically equivalent also.
J. H. C. Whitehead gave a faulty proof of the Poincaré conjecture
in [30] but pointed out his error shortly thereafter in [31].

There were two competing solutions for the Poincaré conjecture
being rumored at the Institute for Advanced Study in Princeton in
1957. Although both proofs fell flat, the rumor was so strong for
a while that the author considered withdrawing his partial solution
[5] which had already been accepted by the Annals.

A “proof” of over 100 pages appeared [12] in 1958. The last
half of the paper gives the true result that a compact connected
triangulated 3-manifold is topologically $S^3$ if its 2-skeleton can be
embedded in $E^3$. The first half asserts that for any triangulation of
a homotopy 3-sphere, the 2-skeleton can be embedded in $E^3$. The
plan of the proof is to embed the 2-simplexes one at a time, taking
care that the images fan out around edges and vertices in the way
the original simplexes do. Most of the first half is clear where it is
shown how to embed the first few 2-simplexes. It is then claimed
that if an obstruction arises, it would have been possible to start
over and by a different approach embed more of the 2-simplexes.
The argument here is not clear.

It was shown by Stallings, Zeeman, Smale, and Wallace that in
high dimensions certain other versions of the Poincaré conjecture
hold [25, 33, 34, 24, 29]. Theorem 20 was shown.

Theorem 20. If $M^n$ ($n \ge 5$) is a compact combinatorial n-manifold
that has the same connectedness properties as $S^n$, then $M^n$ is
topologically equivalent to $S^n$.

proof. Although we do not include details, the proof may be 
given with the following steps. 

1. Using an engulfing lemma [26], show that any (n - 3)-dimensional
polyhedron in $M^n$ lies on the interior of a polyhedral n-cell
in $M^n$.

2. Put the (n - 3)-skeleton $K^{n-3}$ of the triangulation T of $M^n$ on
the interior of a topological n-cell $C_1$.

3. Get the dual of $K^{n-3}$ by taking the first barycentric subdivision
of T and letting the dual $H^2$ be the sum of all simplexes of this
subdivision that miss $K^{n-3}$. It is to be noted that $H^2$ is of dimension 2.

4. If $2 \le n - 3$ (and this occurs only when $n \ge 5$), there is a
topological n-cell $C_2$ whose interior covers $H^2$.

5. Swell up $C_1$ and $C_2$ until the sum of their interiors covers $M^n$.

6. It follows from the Schoenflies theorem proved by Brown [7]
that any compact n-manifold is topologically $S^n$ if it is the sum of
two open n-cells.
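
The dimension count behind steps 1 to 4 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the sketch assumes that the dual of a k-skeleton in a triangulated n-manifold has dimension n - k - 1 and that the engulfing lemma of step 1 handles polyhedra of dimension at most n - 3.

```python
def dual_dimension(n: int, k: int) -> int:
    """Dimension of the dual skeleton of the k-skeleton in an n-manifold (assumed n - k - 1)."""
    return n - k - 1


def engulfing_applies(n: int, dim: int) -> bool:
    """Step 1 is assumed to handle polyhedra of dimension at most n - 3."""
    return dim <= n - 3


def both_skeletons_engulfed(n: int) -> bool:
    """True when both the (n - 3)-skeleton and its dual fit the engulfing lemma."""
    k = n - 3
    return engulfing_applies(n, k) and engulfing_applies(n, dual_dimension(n, k))
```

Under these assumptions the dual always has dimension 2, and the check succeeds exactly when n is at least 5, which is where the outline above needs its hypothesis.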

One of the reasons we have been able to prove so many interesting
things about 3-manifolds is that we know so much about $S^2$. Not
knowing more about $S^3$ is a bottleneck in learning more about
higher dimensional manifolds. The following theorem gives one
reason for wanting to get an answer to the Poincaré conjecture.

Theorem 21. Any triangulated 4-manifold is a combinatorial
4-manifold if the Poincaré conjecture holds.

This result follows from the following theorem. 




Theorem 22. If T is a triangulation of a 4-manifold $M^4$ and v is a
vertex of T, then $L(v, T) = \mathrm{Bd\ Star}(v, T)$ is a simply connected
3-manifold.

Proof. Note that $L(v, T)$ consists of the sum of all 3-simplexes
$s^3$ of T such that the join of v and $s^3$ is a 4-simplex of T. Let $T'$ be
the set of simplexes of T in $L(v, T)$. We note that $L(v, T)$ is simply
connected. If it were not, some simple closed curve near v in
$M^4 - v$ could not be shrunk to a point on a small subset of $M^4 - v$.

To show that $L(v, T)$ is a manifold, it suffices to show that the
link $L(v', T')$ of each vertex $v'$ of $T'$ is a 2-sphere. This link $L(v', T')$
is the sum of all 2-simplexes $s^2$ of $T'$ such that the join of $s^2$ and $v'$ is
a 3-simplex of $T'$. We note that $L(v', T')$ is a 2-complex. Also,
$L(v', T')$ is connected or else $v'$ locally separates $L(v, T)$ and $vv'$
locally separates $M^4$. Each 1-simplex $s^1$ of $T'$ in $L(v', T')$ is a face
of precisely two 2-simplexes of $T'$ in $L(v', T')$ or else the invariance
of domain is violated in $M^4$ near the join of $s^1$ and v. No point s of
$L(v', T')$ locally separates $L(v', T')$ or else $v's$ locally separates
$L(v, T)$ and $vv's$ locally separates $M^4$. Hence $L(v', T')$ is a
connected 2-manifold.

To show that the connected 2-manifold $L(v', T')$ is a 2-sphere, it
suffices to show that each simple closed curve in $L(v', T')$
homologically bounds in $L(v', T')$. If such a simple closed curve $J_1$
did not bound, there would be a small simple closed curve $J_2$ in
$L(v, T) - v'$ near $v'$ such that $J_2$ does not bound on a small subset
of $L(v, T) - v'$. Then there would be a small simple closed curve
$J_3$ on the join of v and $J_2$ near the center of $vv'$ such that $J_3$ does not
bound on a small set in $M^4 - vv'$. This violates the fact that the
first homology of a 4-cell remains trivial even on the deletion of a
closed set homeomorphic to a subset of an arc.

REFERENCES 

1. J. W. Alexander, “Note on two three-dimensional manifolds with the same
group,” Trans. Amer. Math. Soc. 20 (1919) 339-342.

2. R. H. Bing, “A characterization of 3-space by partitioning,” Trans. Amer.
Math. Soc. 70 (1951) 15-27.

3. ---, “Locally tame sets are tame,” Ann. of Math. 59 (1954) 145-158.

4. ---, “Approximating surfaces with polyhedral ones,” Ann. of Math. 65
(1957) 454-483.

5. ---, “Necessary and sufficient conditions that a 3-manifold be $S^3$,”
Ann. of Math. 68 (1958) 17-37.



6. ---, “An alternative proof that 3-manifolds can be triangulated,”
Ann. of Math. 69 (1959) 37-65.

7. M. Brown, “A proof of the generalized Schoenflies theorem,” Bull. Amer.
Math. Soc. 66 (1960) 74-76.

8. R. H. Fox, “Construction of simply connected 3-manifolds,” Topology of
3-Manifolds and Related Topics, Prentice-Hall, Englewood Cliffs, N.J.
(1962) 213-216.

9. F. Frankl, “Zur Topologie des dreidimensionalen Raumes,” Monatsh.
Math. Phys. 38 (1931) 357-364.

10. J. Hempel, “Construction of orientable 3-manifolds,” Topology of
3-Manifolds and Related Topics, Prentice-Hall, Englewood Cliffs, N.J.
(1962) 207-212.

11. ---, “A simply connected 3-manifold is $S^3$ if it is the sum of a solid
torus and the complement of a torus knot,” Bull. Amer. Math. Soc. 16
(1964) 154-158.

12. K. Koseki, “Poincarésche Vermutung in Topologie,” Math. J. Okayama
Univ. 8 (1958) 1-106.

13. D. R. McMillan, Jr., “On homologically trivial 3-manifolds,” Trans. Amer.
Math. Soc. 98 (1961) 350-367.

14. J. Milnor, “A unique decomposition of 3-manifolds,” Amer. J. Math. 84
(1962) 1-7.

15. E. E. Moise, “Affine structures in 3-manifolds, IV. Piecewise linear
approximations of homeomorphisms,” Ann. of Math. 55 (1952) 215-222.

16. ---, “Affine structures in 3-manifolds, V. The triangulation theorem
and Hauptvermutung,” Ann. of Math. 56 (1952) 96-114.

17. C. D. Papakyriakopoulos, “On solid tori,” Proc. London Math. Soc. (3) 7
(1957) 281-299.

18. ---, “On Dehn’s lemma and the asphericity of knots,” Ann. of Math.
66 (1957) 1-26.

19. ---, “Some problems on 3-dimensional manifolds,” Bull. Amer. Math.
Soc. 64 (1958) 311-335.

20. ---, “Reduction of the Poincaré conjecture to other conjectures,”
Bull. Amer. Math. Soc. 68 (1962) 360-366.

21. H. Poincaré, “Second complément à l’analysis situs,” Proc. London Math.
Soc. 32 (1900) 277-308.

22. ---, “Cinquième complément à l’analysis situs,” Rend. Circ. Mat.
Palermo 18 (1904) 45-110.

23. Mary E. Rudin, “An unshellable triangulation of a tetrahedron,” Bull.
Amer. Math. Soc. 64 (1958) 90-91.

24. S. Smale, “Generalized Poincaré’s conjecture in dimensions greater than
four,” Ann. of Math. (2) 74 (1961) 391-406.

25. J. R. Stallings, “Polyhedral homotopy spheres,” Bull. Amer. Math. Soc.
66 (1960) 485-488.

26. ---, “The piecewise-linear structure of euclidean space,” Proc.
Cambridge Philos. Soc. 58 (1962) 481-488.

27. E. R. Van Kampen, Remark on the address of S. S. Cairns, Michigan
Lectures in Topology (1941).

28. A. H. Wallace, “Modifications and cobounding manifolds,” Canad. J.
Math. 12 (1960) 503-528.




29. A. H. Wallace, “Modifications and cobounding manifolds,” II, J. Math.
Mech. 10 (1961) 773-809.

30. J. H. C. Whitehead, “Certain theorems about three-dimensional
manifolds,” I, Quart. J. Math. Oxford 5 (1934) 308-320.

31. ---, “Three-dimensional manifolds (corrigendum),” Quart. J. Math.
Oxford 6 (1935) 80.

32. ---, “Simplicial spaces, nuclei, and m-groups,” Proc. London Math. Soc.
45 (1939) 243-327.

33. E. C. Zeeman, “The generalized Poincaré conjecture,” Bull. Amer. Math.
Soc. 67 (1961) 270.

34. ---, “The Poincaré conjecture for $n \ge 5$,” Topology of 3-Manifolds
and Related Topics, Prentice-Hall, Englewood Cliffs, N.J. (1962) 198-204.



Partial Differential
Equations: Problems and
Uniformization in Cauchy’s Problem

Lars Gårding

Lecture I  Some Problems in the Theory of
Partial Differential Equations

1. INTRODUCTION

The theory of general partial differential operators has made big
strides in the last ten or fifteen years. It has been treated in more
or less comprehensive expositions [1, 2] and has now reached the
monograph stage [3]. This lecture describes some recent
developments for elliptic and hyperbolic equations and states some problems
that they pose. There is also a concluding section on Frobenius'
theorem and invariant distributions.

Elliptic Operators. Let X be an infinitely differentiable manifold of
dimension n. Let

$a(x, D) = \sum_{|\alpha| \le m} a_\alpha(x) D^\alpha$

where

$D_k = \frac{1}{i} \frac{\partial}{\partial x_k}, \quad D^\alpha = D_1^{\alpha_1} \cdots D_n^{\alpha_n}, \quad |\alpha| = \alpha_1 + \cdots + \alpha_n$

be a differential operator of order m with smooth complex
coefficients defined in an open bounded part V of X with a smooth
boundary S. Let

$g(x, D) = \sum_{|\alpha| = m} a_\alpha(x) D^\alpha$

be the principal part of a. Let $B_1, \ldots, B_l$ be similar operators of
orders $m_1, \ldots, m_l$ defined on S. The classical linear boundary
problems for Laplace’s operator are of the form

(1.1) $au = v$ in V, $B_j u = w_j$ on S

The general theory of such problems (see, e.g., [4] and its
bibliography) is based on a principle of localization due to Lopatinski [5],
which can be formulated loosely as follows: the problem (1.1) is
reasonable provided it is sufficiently reasonable at every point
$y \in S$ when we freeze the operators a and $B_j$ at y [replace $a(x, D)$ by
$a(y, D)$ and similarly for B] and replace S by its tangent plane at y
and V by a half-space that the plane bounds. Having done this we
can also get rid of all the variables x except one by performing a
Fourier transform in the tangent plane. Thus (1.1) is reduced to a
one-dimensional problem:

(1.2) $a(y, p + N D_t) u(t) = v(t)$, $B_j(y, p + N D_t) u(0) = w_j$

where $N = N(y)$ is the interior normal of S at y, $p = (p_1, \ldots, p_n)$
is real, $t \ge 0$ is a real variable, and $D_t = -i\, d/dt$. Further, some
experience with Fourier transforms will tell us that it is sufficient to
consider (1.2) when p is large modulo N. That (1.2) is reasonable
means that it has a unique solution which is small for large t
provided $v = 0$. In general, $g(x, N) \ne 0$, so that we can factorize

$a(x, p + N D_t) = g(x, N) \prod_{k=1}^{m} (D_t - \tau_k)$

with

$\tau_k = \tau_k(x, p, N)$

When $v = 0$, any solution of (1.2) is a linear combination of
exponential solutions

$e^{i \tau_k t}, \quad \tau_k = \tau_k(y)$

which fall off for large t if and only if $\mathrm{Im}\, \tau_k > 0$. If (1.2) is to
determine u, the number of such $\tau_k$ ought to be the same as the
number of linearly independent B's. Trouble can be expected at
points y where $g(y, N) = 0$ since some $\tau_k$ becomes infinite there.



We avoid this by requiring that

(1.3) $g(x, p) \ne 0$ when $p \ne 0$, real, and $x \in V + S$

This is the well-known condition of ellipticity; it implies that all
solutions of $au = 0$ in V are smooth. Returning to the $\tau_k$, we
observe that, because of (1.3),

$\frac{a(x, p)}{g(x, p)} \to 1$

as $p \to \infty$. This means that for large real p, the zeros of
$a(x, p + \tau N) = 0$ and $g(x, p + \tau N) = 0$ are close of order $o(|p|)$ and
that the order of magnitude of the latter ones, which are
homogeneous of order 1 in p, is $|p|$. Hence we can replace our condition on
the zeros $\tau$ by the following.

(1.4) $g(x, p + \tau N) = 0$ has precisely l zeros with positive
imaginary parts for $x \in S$ and p real and $\ne 0$ mod N

and a loosely formulated condition:

(1.5) the operators B are linearly independent

When $n > 2$, we can pass from p to $-p$ continuously, keeping p
large modulo N. Hence (1.4) means that, necessarily, $2l = m$ in
this case. The simplest boundary problem is Dirichlet’s problem
where $B_1, \ldots, B_l$ are the first l normal derivatives. The general
theory associated with (1.3), (1.4), and a quantitative form of (1.5)
gives as a result that the mapping

(1.6) $u \to Tu = au, B_1 u, \ldots, B_l u$

when defined on suitable function spaces has a null-space of finite
dimension $\nu$ and a range of finite co-dimension $\mu$. Increasing
smoothness of Tu implies increasing smoothness of u. This theory
covers, for example, the problem of the oblique derivative (a
Laplace’s operator, $B = B_1$, an arbitrary real first-order derivative)
and has given a very interesting perspective on the classical
boundary problems. But it also raises a number of problems.
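
Condition (1.4) can be checked numerically in the simplest case. The sketch below is an illustration only: it assumes the principal symbol $g(\xi) = \xi_1^2 + \cdots + \xi_n^2$ of the Laplace operator (up to sign), and the function names are hypothetical. It counts the zeros $\tau$ of $g(p + \tau N) = 0$ with positive imaginary part, and it also records the single zero for the first-order symbol $\xi_1 + i\xi_2$ in two variables, where p cannot be moved to $-p$ while staying away from multiples of N.

```python
import cmath


def upper_half_plane_roots(p, N):
    """Count zeros tau of sum_k (p_k + tau*N_k)^2 = 0 with Im tau > 0."""
    a = sum(n * n for n in N)
    b = 2 * sum(x * n for x, n in zip(p, N))
    c = sum(x * x for x in p)
    disc = cmath.sqrt(b * b - 4 * a * c)
    roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]
    return sum(1 for r in roots if r.imag > 1e-12)


def cauchy_riemann_root(p1):
    """Zero tau of p1 + i*tau = 0, the symbol xi_1 + i*xi_2 with N = (0, 1)."""
    return 1j * p1
```

For the Laplace symbol the count is 1 for every real p that is not a multiple of N, in agreement with $2l = m$ for $m = 2$. For the first-order symbol the zero lies in the upper half-plane only when $p_1 > 0$, which is why the argument above is restricted to $n > 2$.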

In the first place, the index $\nu - \mu$ (the dimension of the null-space
minus the co-dimension of the range) becomes invariant under
continuous deformations of the operator T. For the problem of the
oblique derivative in two dimensions, it was discovered long ago by
Fritz Noether [6] that the index equals twice the ordinary winding
number of the vector field B relative to the boundary S. Are there
similar interpretations in more general cases? More generally, what
is the connection between the index and the deformation classes?
This question also has wider implications. The question of the
index for an elliptic system on a manifold has recently begun to
attract attention [7, 8]. Second, it is essential for the principle of
localization that S be smooth. Can it be extended to cases when S
has corners of a simple type? In that situation we can expect a
complication: increased smoothness of Tu will not make u smooth.
Consistency conditions at the corners are also required. Which are
they and for which degrees of smoothness do they enter? How is
the index influenced by the consistency conditions? For
discontinuous boundary operators B, these problems have been studied in
a very interesting paper by Peetre [9].

2. SOBOLEV SPACES AND BOUNDARY PROBLEMS 

Let $W_p^l(V)$ be the Banach space of complex functions whose
derivatives of order $\le l$ are p-summable over V. One of the
imbedding theorems of Sobolev [10] says that the topological inclusion

(2.1) $W_p^l(V) \subset W_q^m(V)$

holds, provided, for example, that

$0 < \frac{n}{p} - l < \frac{n}{q} - m$

and that S can be reached by the vertex of a cone of fixed shape
contained in V. We say that $y \in S$ is regular if it belongs to the
boundary of an open part of V which has the property (2.1). Are
there better geometrical criteria than the cone property? Are there
necessary and sufficient conditions for regularity resembling the
Wiener criterion for regularity in Dirichlet’s problem? The results
of Campanato [11] are a beginning.

The last question leads us back to elliptic boundary problems.
The form given to them previously is very comprehensive, but this
is bought with the assumption of smoothness of the boundary. In
simple cases, all smoothness can be thrown away. Let $H^l(V)$ be
the closure in $W_2^l(V)$ of smooth functions with compact supports.
If V is bounded, Dirichlet’s problem for a real elliptic $a(x, D)$ of
order $2l$ can be put in the form

$au = 0, \quad u - w \in H^l(V)$

where $w \in W_2^l(V)$ is given. Not much is known about how the
smoothness of u in V + S is influenced by that of S and w, but a
start has been made by Kondratev [12], who works with a general
notion of capacity.

3. HYPERBOLIC EQUATIONS 

For simplicity, we shall now assume that $X = R^n$. A
distribution $E(x, y)$ such that

$a(x, D) \int E(x, y) f(y)\, dy = f(x)$

for smooth functions f with compact supports is said to be an
elementary solution of a with pole at y. To begin with, let $a = a(D)$
have constant coefficients. We say that a is hyperbolic if it has an
elementary solution with support in a cone issuing from the pole,
which is proper in the sense that it is contained in some half-space.
Let $(x - y)N \ge 0$ be one of them. It can be shown (see [3]) that
an equivalent algebraic condition is the following: $g(N) \ne 0$ and the
zeros of

$a(p + \tau N) = 0$

have bounded imaginary parts when p is real. Observe that the
pairs $a, \pm N$ have this property at the same time. In particular,
the zeros of

(3.1) $g(p + \tau N) = g(N) \prod_{k=1}^{m} [\tau + \lambda_k(p)]$

are real for p real; if they are different for $p \ne 0$ mod N, which we
shall assume, the inverse implication holds and a is said to be
strongly hyperbolic (with respect to N).

Example 1. The wave operator $c^{-2} D_1^2 - D_2^2 - \cdots - D_n^2$,
$xN = x_1$, $\lambda_{1,2} = c^{-1} p_1 \pm \sqrt{p_2^2 + \cdots + p_n^2}$, ($c > 0$). The
characteristic conoid $\Gamma$ of a is the real algebraic surface $g(p) = 0$. This
surface consists of m different sheets $\Gamma_1, \ldots, \Gamma_m$ given by
$\lambda_1(p) = 0, \ldots, \lambda_m(p) = 0$. Every straight line with direction N not
passing through the origin meets $\Gamma$ in m real separate points. The
complement of $\Gamma$ has m + 1 components (if $n > 2$). The
component containing N is called the characteristic cone $\hat\Gamma = \hat\Gamma(g, N)$ of
a. It is convex, and $-N$ determines the opposite cone.
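
The algebraic condition for hyperbolicity can be tried out on the wave operator of Example 1. The following sketch is illustrative only, and the function name is hypothetical. It solves the quadratic $g(p + \tau N) = 0$ for $g(\xi) = c^{-2}\xi_1^2 - \xi_2^2 - \cdots - \xi_n^2$ and an arbitrary direction N with $g(N) \ne 0$.

```python
import cmath


def wave_symbol_roots(p, N, c):
    """Zeros tau of g(p + tau*N) = 0 for g(xi) = xi_1^2/c^2 - xi_2^2 - ... - xi_n^2."""
    weights = [1.0 / c**2] + [-1.0] * (len(p) - 1)
    A = sum(w * n * n for w, n in zip(weights, N))          # equals g(N)
    B = 2 * sum(w * x * n for w, x, n in zip(weights, p, N))
    C = sum(w * x * x for w, x in zip(weights, p))          # equals g(p)
    d = cmath.sqrt(B * B - 4 * A * C)
    return ((-B + d) / (2 * A), (-B - d) / (2 * A))
```

With N = (1, 0, ..., 0) the two zeros are real and distinct whenever p is not a multiple of N, as strong hyperbolicity requires. With a direction such as N = (0, 1, 0), which lies outside the characteristic cone, real p can produce non-real zeros, so the operator is not hyperbolic with respect to that N.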

Example 2. In the example above, $\hat\Gamma$ is the cone
$c^{-1} p_1 - \sqrt{p_2^2 + \cdots + p_n^2} > 0$. The conoid $\Gamma$ has a dual object, the
bicharacteristic conoid C in x-space, given in parametric form by

$x = \pm g_p(p), \quad p \in \Gamma$

If p runs through a sheet of $\Gamma$, x runs through a corresponding sheet
of C. The common part of C and $xN > 0$ is contained in and
bounds the bicharacteristic cone

$\hat C$ = all x with $x \hat\Gamma \ge 0$

Example 3. In the foregoing example, C is the conoid

$c^2 x_1^2 - x_2^2 - \cdots - x_n^2 = 0$

and $\hat C$ the cone

$c x_1 - \sqrt{x_2^2 + \cdots + x_n^2} \ge 0$

In the general case, $a(D)$ has an elementary solution $E(x - y)$
such that

support $E \subset \hat C$

The singularities of E are on the bicharacteristic conoid C. Outside
C, E is regular and, in fact, analytic.

When a has variable coefficients and is strongly hyperbolic, we
have characteristic conoids $\Gamma(y)$ and cones $\hat\Gamma(y)$ in p-space
determined by $g(y, p)$ and a continuous choice of $0 \ne N(y) \in \hat\Gamma(y)$.
The bicharacteristic conoid $C(y)$ with vertex at y is no longer
generated by straight lines. It is the projection on x-space of all real
bicharacteristic bands

$dx = g_p(x, p)\, dt, \quad dp = -g_x(x, p)\, dt$

$x[0, y] = y, \quad p[0, q] = q$

with $g(y, q) = 0$. Restricted locally to a half-space through y, it
bounds the bicharacteristic cone $\hat C(y)$, whose tangent cone at y is
characterized by $(x - y)\hat\Gamma(y) \ge 0$. The operator $a(x, D)$ has an
elementary solution $E(x, y)$ with pole at y and support in $\hat C(y)$
regular outside $C(y)$ (see [3]). Cauchy’s problem

$au = v$, $u = w$ m times on S

(meaning that $u - w$ vanishes on S together with all derivatives of
order $< m$) has a unique solution for surfaces $S: s(x) = 0$ which are
space-like in the sense that

grad $s(x) \in \pm \hat\Gamma(x)$

for all $x \in S$ (see, e.g., [3]). The value of the solution u at x
depends only on v and w in the compact region common to one side
of S and one of the bicharacteristic cones issuing from x.


Progressive Waves. The two-dimensional wave equation

$c^2 u_{xx} - u_{tt} = 0$

has the solutions $u = h(x \pm ct)$, h arbitrary, called progressive waves.
Considered as functions of x, these solutions move (are translated)
in opposite directions with velocity c when the time t changes.
Generalized to a nonlinear situation, they occupy a prominent place
in Riemann’s work [14] on wave propagation.
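
That a translated profile solves the two-dimensional wave equation can be checked with central differences. The sketch below is a numerical illustration under stated assumptions: the profile `bump` and the step size `d` are arbitrary choices made here, and the function names are hypothetical.

```python
import math


def wave_residual(h, c, speed, x, t, d=1e-3):
    """Central-difference value of c^2 u_xx - u_tt for u(x, t) = h(x - speed*t)."""
    def u(x_, t_):
        return h(x_ - speed * t_)

    u_xx = (u(x + d, t) - 2 * u(x, t) + u(x - d, t)) / d**2
    u_tt = (u(x, t + d) - 2 * u(x, t) + u(x, t - d)) / d**2
    return c**2 * u_xx - u_tt


def bump(s):
    """A smooth sample profile (an assumption of this sketch)."""
    return math.exp(-s * s)
```

The residual is close to zero when the profile travels with speed c or -c, and it is far from zero for a profile travelling at any other speed, since the exact residual is $(c^2 - \text{speed}^2)\, h''$.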

Much of the recent progress in the theory of hyperbolic operators
has been achieved by the construction of approximate progressive
waves in the general case (see, e.g., [13, 15, and 16]). We shall now
outline this construction. Let $\sigma(x)$ be a real regular function with
$\sigma_x = \operatorname{grad} \sigma \ne 0$ and let h be a distribution in one variable with
successive derivatives and integrals

$h_0 = h, \quad h'_{k+1} = h_k; \quad k = 0, \pm 1, \pm 2, \ldots$

We say that a distribution v is regular of order $\ge k$ with respect to
the distribution

$(h \circ \sigma)(x) = h(\sigma(x))$

if there exist regular $v_k, v_{k+1}, \ldots$ such that the differentiability of

$v(x) - \sum_{j=k}^{l} v_j(x)\, h_j \circ \sigma(x)$

tends to infinity with l. Notice that

$a\left(x, \frac{\partial}{\partial x}\right) (F\, h \circ \sigma) = \sum_{k=0}^{m} \left[ a_k\left(x, \frac{\partial}{\partial x}\right) F(x) \right] h_{k-m} \circ \sigma(x)$

where

order $a_k\left(x, \frac{\partial}{\partial x}\right) \le k$

In particular, if

$a\left(x, \frac{\partial}{\partial x}\right) = g\left(x, \frac{\partial}{\partial x}\right) + g'\left(x, \frac{\partial}{\partial x}\right) + \cdots$

with g and g' homogeneous of orders m and m - 1 respectively, then

$a_0 = g(x, \sigma_x)$

$a_1 = g_p(x, \sigma_x) \frac{\partial}{\partial x} + \frac{1}{2} g_{pp}(x, \sigma_x) \sigma_{xx} + g'(x, \sigma_x)$

with

$g_p(x, p) = \operatorname{grad}_p g(x, p), \quad g_{pp} = \frac{\partial^2 g}{\partial p_j\, \partial p_k}$

and natural summation conventions. Hence if

(3.2) $g(x, \sigma_x) = 0, \quad a_1\left(x, \frac{\partial}{\partial x}\right) F(x) = 0$

(3.3) $u(x) = F(x)\, h \circ \sigma(x)$

then au is regular of order $\ge 2 - m$, that is, two units more than
expected. Observe that the equations (3.2) do not involve h and
that they have plenty of solutions. More generally, it is easy to
see that we can choose

(3.4) $u(x) = \sum_{j=0}^{l} F_j(x)\, h_j \circ \sigma(x)$

so that au is regular of order $\ge l + 2 - m$. In the analytic case we
can even let $l \to \infty$ and obtain solutions of $au = 0$ depending among
other things on an arbitrary distribution h.

It is clearly reasonable to consider (3.3) and (3.4) as approximate
progressive waves with the wave fronts

$\sigma(x)$ = constant

Notice that if $\lambda_k(x, p)$ is defined by factoring $g(x, p + \tau N)$ [see
(3.1)], then (3.2) means that

$\lambda_k(x, \sigma_x) = 0, \quad a_1\left(x, \frac{\partial}{\partial x}\right) F(x) = 0$

for one k and that $\sigma$ and F can be prescribed on any space-like surface
S. In particular, there are m different wave fronts reducing to a
given one in S. An elementary solution having a pole at $y \in S$ and
vanishing on one side of S is an example of this solution: it can be
looked upon as a sum of m progressive waves starting from a point in
S. Approximate progressive waves can be used to give existence
proofs for elementary solutions and for Cauchy's problem. So far,
proofs of this kind require more smoothness of the coefficients of a
than the proofs using energy methods, but they give more
information about the irregularities.

Most physical boundary problems for hyperbolic equations are of
mixed type in the sense that they involve reflecting barriers. A lot
of work has been done with these problems (see, e.g., [17] and [18]
and the literature quoted there), but no systematic theory is in
sight. It should be noted that the theory of reflection of
approximate progressive waves is an interesting but at present neglected
field of study.

Wave propagation also has an ergodic aspect: when there is a
complicated enough mixture of waves, the theory has to be restricted
to asymptotic distributions of energy for large times. In some
simple cases there is equidistribution in space for the high
frequencies (see [19]); recently the energy distribution outside a reflecting
obstacle has been analyzed by Lax, Morawetz, and Phillips [20] with
very interesting results.



4. FROBENIUS’ THEOREM 

Let M be a $C^\infty$-module of homogeneous real first-order linear
differential operators

$A = \sum_{k=1}^{n} a_k(x) \frac{\partial}{\partial x_k}$

which has the property that

$A, B \in M \Rightarrow [A, B] = AB - BA \in M$

and let M' be all distributions $f = f(x)$ for which $Af = 0$ for every
$A \in M$. Let $r(x)$ be the linear dimension of M at the point x.
Close to a point x where r is constant we have the classical theorem
of Frobenius [21]: there exist coordinates $y_1, \ldots, y_n$ such that M
is spanned by $\partial/\partial y_1, \ldots, \partial/\partial y_r$. In particular, M' is the set of
distributions depending only on $y_{r+1}, \ldots, y_n$.

The assumption that $r(x)$ is constant is violated in many
interesting cases, for example, when the elements of M are the infinitesimal
generators of a connected Lie group G acting on $C^\infty(X)$. In that
case, M' is the set of distributions invariant under G. When G is
the Lorentz group and X is a product of q copies of $R^n$, where n is the
dimension of G, M' has some interest in connection with quantum
field theory. It is known for q = 1 ([22]), and has been investigated
for q = 2 ([23]). But these are only partial results, and we would
like to have an answer to the following question. What is the local
structure of M and M' in a neighborhood of a point x where r is
discontinuous? If the coefficients of all $A \in M$ are real analytic,
a complete answer is perhaps possible.
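
A small example in which $r(x)$ jumps can be computed directly. The sketch below does not treat the Lorentz group. As a simpler stand-in chosen here, it uses the rotation group of $R^3$, whose infinitesimal generators are the linear vector fields $A_M = \sum_k (Mx)_k\, \partial/\partial x_k$ for the three skew matrices `L1`, `L2`, `L3`. All names are hypothetical, and the identity $[A_M, A_N] = A_{NM - MN}$ for linear vector fields is used without proof.

```python
from fractions import Fraction

L1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
L2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
L3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def bracket_matrix(M, N):
    """Matrix of the commutator [A_M, A_N], which equals A_{NM - MN}."""
    NM, MN = matmul(N, M), matmul(M, N)
    return [[NM[i][j] - MN[i][j] for j in range(3)] for i in range(3)]


def apply(M, x):
    return [sum(M[i][k] * x[k] for k in range(3)) for i in range(3)]


def rank(rows):
    """Rank of a list of 3-vectors by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    for col in range(3):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                f = rows[i][col] / rows[rk][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def r(x):
    """Linear dimension at x of the span of the three rotation generators."""
    return rank([apply(M, x) for M in (L1, L2, L3)])
```

The bracket of the first two generators is minus the third, so the module they span is closed under commutators. The dimension r(x) equals 2 away from the origin and drops to 0 at the origin, which is the kind of discontinuity the question above is about.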


Lecture II Uniformization in Cauchy’s Problem 
5. INTRODUCTION 

Let X be a complex analytic nonsingular variety of dimension l
with coordinates $x_1, \ldots, x_l$, and let

$a\left(x, \frac{\partial}{\partial x}\right) = \sum_{|\alpha| \le m} a_\alpha(x) \left(\frac{\partial}{\partial x}\right)^\alpha$

be a linear differential operator on X of order m with holomorphic
(regular analytic) coefficients. Let

$g\left(x, \frac{\partial}{\partial x}\right) = \sum_{|\alpha| = m} a_\alpha(x) \left(\frac{\partial}{\partial x}\right)^\alpha$

be its principal part. Let $S: s(x) = 0$ be a nonsingular analytical
hypersurface in X and consider Cauchy’s problem with
holomorphic data v and w:

(5.1) $a\left(x, \frac{\partial}{\partial x}\right) u(x) = v(x)$ in X,
      $u(x) - w(x) = 0$ m times on S

According to the theorem of Cauchy-Kowalewski, this problem has a
unique holomorphic solution near a point $x \in S$ which is
noncharacteristic in the sense that

$g(x, s_x) \ne 0$

where $s_x = \operatorname{grad} s$. In this connection we may ask ourselves the
following question: what happens to the solution near the
characteristic points of S? Take for instance

$a = \frac{\partial}{\partial x_1}, \quad S: x_2 - x_1^p = 0$

Then

$T: x_1 = x_2 = 0$

is a characteristic subvariety of S of codimension 1. It is part of a
characteristic hyperplane

$K: x_2 = 0$

which touches S along T. More precisely, the order of contact is p.
The solution of Cauchy’s problem is

$u(x) = w(y_1, x_2, \ldots) + \int_{y_1}^{x_1} v(s, x_2, \ldots)\, ds, \quad y_1 = \sqrt[p]{x_2}$

It is not holomorphic, but it is ramified p times around K. The
substitution

$x_1, x_2, \ldots \to x_1, t^p, \ldots$

uniformizes $u(x)$.

In [4], J. Leray proved that this situation is general: the solution
is ramified around a characteristic subvariety tangent to the initial
surface, and it can be uniformized explicitly.
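
The ramification in the example can be followed numerically. The sketch below is an illustration with hypothetical function names. It tracks the value $y_1$ with $y_1^p = x_2$ continuously while $x_2$ runs around the unit circle, which encircles the characteristic hyperplane $x_2 = 0$. The continuation is done by choosing, at each small step, the p-th root nearest to the previous value.

```python
import cmath
import math


def pth_roots(z, p):
    """All p complex p-th roots of a nonzero complex number z."""
    radius = abs(z) ** (1.0 / p)
    phi = cmath.phase(z)
    return [radius * cmath.exp(1j * (phi + 2 * math.pi * k) / p) for k in range(p)]


def continue_root(p, loops, steps_per_loop=360):
    """Continue y with y**p = x2 along x2 = exp(i*theta), starting from y = 1 at x2 = 1."""
    y = 1 + 0j
    for n in range(1, loops * steps_per_loop + 1):
        theta = 2 * math.pi * n / steps_per_loop
        x2 = cmath.exp(1j * theta)
        y = min(pth_roots(x2, p), key=lambda cand: abs(cand - y))
    return y
```

After one loop the tracked root is multiplied by $e^{2\pi i/p}$, and it returns to its starting value only after p loops. In the variable t with $x_2 = t^p$ the same quantity is simply t, which is single valued.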

In the following, which is a report on a long paper by Gårding,
Leray, and Kotake [2], all this will be proved in a simple fashion.
Actually, [2] deals with systems, but we shall restrict ourselves to a
single operator. Following [2] we shall introduce an extra variable
$\xi$ in (5.1), replacing S by $S(\xi): s(x) = \xi$. The problem is then

(5.2) $a\left(x, \frac{\partial}{\partial x}\right) u(\xi, x) = v(\xi, x)$ in X,
      $u(\xi, x) - w(\xi, x) = 0$ m times on $S(\xi)$

This problem is a special case of (5.1) in (l + 1) variables, and the
Cauchy-Kowalewski theorem still holds. The uniformization of
(5.2) is very simple. Let $\xi(t, x)$ be the solution of
Hamilton-Jacobi's equation

$\xi_t + g(x, \xi_x) = 0, \quad \xi(0, x) = s(x)$

We shall see that the substitution

$\xi, x \to \xi(t, x), x$

uniformizes the solution of (5.2). Its restriction to $\xi(t, x) = 0$
uniformizes the solution of (5.1).
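
As a hedged illustration, the substitution can be worked out for the example $a = \partial/\partial x_1$, $S: x_2 - x_1^p = 0$ of this section. The letter $\eta$ for the covector is notation introduced here, to avoid a clash with the exponent p, and the computation assumes $v = 0$ for brevity.

```latex
\begin{align*}
g(x, \eta) &= \eta_1, \qquad s(x) = x_2 - x_1^{p}, \\
\xi_t + \xi_{x_1} &= 0, \quad \xi(0, x) = s(x)
  \;\Longrightarrow\; \xi(t, x) = x_2 - (x_1 - t)^{p}, \\
u(\xi, x) &= w(\xi, y_1, x_2, \ldots), \qquad y_1^{p} = x_2 - \xi, \\
(u \circ \xi)(t, x) &= w\bigl(\xi(t, x),\; x_1 - t,\; x_2, \ldots\bigr).
\end{align*}
```

After the substitution, $y_1 = x_1 - t$ is single valued, so $u \circ \xi$ is holomorphic in (t, x). On $\xi(t, x) = 0$ one has $x_2 = (x_1 - t)^p$, which is a variant of the substitution $x_2 = t^p$ used in the example.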


6. AN INVARIANT 

Let $p = (p_1, \ldots, p_l)$ denote a covector transforming like a
gradient. A straightforward computation gives

$e^{-\lambda f(x)}\, a\left(x, \frac{\partial}{\partial x}\right) e^{\lambda f(x)} = g(x, f_x) \lambda^m + \left[ \tfrac{1}{2} g_{pp}(x, f_x) f_{xx} + g'(x, f_x) \right] \lambda^{m-1} + \cdots$

where $a(x, \partial/\partial x) = g(x, \partial/\partial x) + g'(x, \partial/\partial x) + \cdots$ with g and g'
homogeneous of order m and m - 1 and

$g_{pp}(x, f_x) f_{xx} = \sum_{j,k} \frac{\partial^2 g(x, f_x)}{\partial p_j\, \partial p_k} \frac{\partial^2 f}{\partial x_j\, \partial x_k}$

Hence, $g(x, p)$ is invariant under coordinate transformations and so
is

$\tfrac{1}{2} g_{pp}(x, f_x) f_{xx} + g'(x, f_x)$
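
A one-variable check may make the expansion concrete. The operator below, with an arbitrary smooth coefficient b(x), is an example chosen here and is not taken from the text.

```latex
\begin{align*}
a &= \frac{d^2}{dx^2} + b(x)\frac{d}{dx}, \qquad g(x, p) = p^2, \quad g'(x, p) = b(x)\,p, \\
e^{-\lambda f}\, a\, e^{\lambda f}
  &= \lambda^2 f'^2 + \lambda\,\bigl(f'' + b f'\bigr), \\
\tfrac12\, g_{pp}(x, f_x)\, f_{xx} + g'(x, f_x) &= f'' + b f'.
\end{align*}
```

The coefficient of $\lambda^2$ is $g(x, f_x)$ and the coefficient of $\lambda$ agrees with the second invariant, as the general formula predicts for m = 2.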


7. A LEMMA 

The main result, which follows in the next section, will be proved
by a reduction to a special case which we will treat now. Let $t$ be
an additional variable, let

$b_{jk} = b_{jk}(t, x, \partial/\partial x)$

be holomorphic differential operators, and consider the Cauchy
problem

(7.1) $\dfrac{\partial u_j}{\partial t} = \sum_k b_{jk} u_k + v_j$

$u_j(0, x) = w_j(x)$

Furthermore, suppose that there exist integers $m_j$ such that

(7.2) order $b_{jk} \le m_j - m_k + 1$

It is no restriction to assume that all $m_j \ge 0$.

Lemma 1. If the $b_{jk}$, $v_j$, and $w_j$ are holomorphic for
$|t| < r$, $|x| = |x_1| + \cdots + |x_l| < R$
then (7.1) has a unique solution which is holomorphic for $|x| < R$
and $t$ small enough, $|t| < \mathrm{const}\,(R - |x|)$.

Remark 1. The uniqueness is obvious since (7.1) permits a
calculation of the power series for $u$ around the origin.

Remark 2. The proof uses the following lemma.

Lemma 2. (Rosenbloom [5], Hörmander [3]) If $u(x)$ is holomorphic
for $|x| < R$ and

$|u(x)| \le (R - |x|)^{-r}$

then

$\left|\dfrac{\partial u}{\partial x_k}\right| \le \dfrac{3(r + 1)}{(R - |x|)^{r+1}}$

Proof. It suffices to take $l = 1$. Then

$\dfrac{du}{dx} = \dfrac{1}{2\pi i} \int_\gamma \dfrac{u(y)}{(x - y)^2}\, dy$

where $\gamma$ is the circle $|x - y| = \epsilon$, $0 < \epsilon < R - |x|$.
Hence

$\left|\dfrac{du}{dx}\right| \le \dfrac{1}{\epsilon\,(R - |x| - \epsilon)^r}$

Putting $\epsilon = (R - |x|)/(r + 1)$ we get the desired result since

$\left(1 + \dfrac{1}{r}\right)^r < e < 3$


Proof of Lemma 1. We use successive approximation as follows

$\dfrac{\partial u_{j,0}}{\partial t} = v_j, \quad u_{j,0}(0, x) = w_j(x)$

$\dfrac{\partial u_{j,r}}{\partial t} = \sum_k b_{jk} u_{k,r-1}, \quad u_{j,r}(0, x) = 0$

for $r > 0$. Then the series

(7.3) $u_j(t, x) = \sum_{r=0}^{\infty} u_{j,r}(t, x)$

converges to a solution. In fact, suppose that we have the estimate

(7.4) $|u_{k,r-1}| \le c\,C^{r-1} r^{m_k} |t|^{r-1} / (R - |x|)^{r+m_k}$

which holds for $r = 1$ provided $c$ is large enough. Then

$\sum_k |b_{jk} u_{k,r-1}| \le c\,C^r r^{1+m_j} |t|^{r-1} / (R - |x|)^{r+1+m_j}$

provided $C$ is large enough, depending on $b_{jk}$, $m$, and $R$. Integrating
we find (7.4) for $r$ and the series (7.3) converges for $C|t| < R - |x|$.
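As a toy illustration of the successive approximations (not taken from [2]), consider a single unknown with $b = \partial/\partial x$ and $v = 0$, so that (7.1) reads $\partial u/\partial t = \partial u/\partial x$, $u(0, x) = w(x)$. This satisfies (7.2) with one integer $m_1$. For polynomial data the terms are $u_r = t^r w^{(r)}(x)/r!$ and the series (7.3) sums to $w(x + t)$. The sketch below assumes polynomial data stored as a coefficient list; all function names are illustrative.

```python
from math import factorial


def derivative(coeffs):
    """Derivative of the polynomial c0 + c1*x + c2*x**2 + ... given as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial with the given coefficients at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))


def successive_approximation(w, t, x, terms=50):
    """Sum u_0 + u_1 + ... for du/dt = du/dx, u(0, x) = w(x).

    u_0 = w, and d(u_r)/dt = d(u_{r-1})/dx with u_r(0, x) = 0,
    hence u_r(t, x) = t**r / r! * (r-th derivative of w)(x).
    """
    total = 0.0
    current = list(w)
    for r in range(terms):
        if not current:  # all higher derivatives of a polynomial vanish
            break
        total += t**r / factorial(r) * evaluate(current, x)
        current = derivative(current)
    return total


# w(x) = x**2: the series should reproduce w(x + t) = (x + t)**2.
print(successive_approximation([0, 0, 1], 0.5, 1.0))
```

For polynomial data the series terminates; for general holomorphic data only the convergence region $C|t| < R - |x|$ of the proof is available.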





8. UNIFORMIZATION 

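We shall use a change of variables

(8.1) $(t, x) \to (\xi(t, x), x)$

$\xi$ being holomorphic. Let us put

$(u \circ \xi)(t, x) = u(\xi(t, x), x)$

so that, with indices denoting derivatives, we have

(8.2) $\dfrac{\partial}{\partial t}\, u \circ \xi = \xi_t\, u_\xi \circ \xi, \quad \dfrac{\partial}{\partial x}\, u \circ \xi = \xi_x\, u_\xi \circ \xi + u_x \circ \xi$

We say that a function

$u(\xi, x)$

is uniformized of order $n$ by (8.1) if $v \circ \xi$ is holomorphic in $t, x$ for
any derivative $v$ of $u$ of order $\le n$. In view of (8.2), it suffices for
this that $v \circ \xi$ is holomorphic when $v$ is one of the functions

$u_j(\xi, x) = \left(-\dfrac{\partial}{\partial \xi}\right)^j u(\xi, x)$

and $0 \le j \le n$. The next lemma is obtained from (8.2) by a
straightforward computation.

Lemma 3. Let

$a(x, \partial/\partial x) = g(x, \partial/\partial x) + g'(x, \partial/\partial x) + \cdots$

with $g$ and $g'$ homogeneous of order $m$ and $m - 1$. Then

$[a u] \circ \xi = \sum_{j=0}^{m} a_j\!\left(x, \xi_x, \xi_{xx}, \ldots, \dfrac{\partial}{\partial x}\right) (u_{m-j} \circ \xi)$

where order $a_j \le j$ and, explicitly,

$a_0 = g(x, \xi_x)$

$a_1 = g_p(x, \xi_x)\, \dfrac{\partial}{\partial x} + \tfrac{1}{2}\, g_{pp}(x, \xi_x)\, \xi_{xx} + g'(x, \xi_x)$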




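Remark. The $a_j$ are independent of the coordinate system. We
have seen in Section 6 that

$h(t, x) = \tfrac{1}{2}\, g_{pp}(x, \xi_x)\, \xi_{xx} + g'(x, \xi_x)$

is also invariant.

Let us now pass to the variables $t, x$ in (5.2). The result is

(8.3) $\sum_{j=0}^{m} a_j\!\left(x, \xi_x, \ldots, \dfrac{\partial}{\partial x}\right) (u_{m-j} \circ \xi) = v \circ \xi$

$u_j \circ \xi = w_j \circ \xi$ for $t = 0$, $0 \le j < m$

which can be written as follows

(8.4) $\dfrac{\partial}{\partial t}\, u_j \circ \xi = -\xi_t\, u_{j+1} \circ \xi, \quad 0 \le j < m - 1$

$g(x, \xi_x)\, \dfrac{\partial}{\partial t}\, u_{m-1} \circ \xi - \xi_t \sum_{j=1}^{m} a_j\!\left(x, \xi_x, \ldots, \dfrac{\partial}{\partial x}\right) (u_{m-j} \circ \xi) = -\xi_t\, v \circ \xi$

$u_j \circ \xi = w_j \circ \xi$ for $t = 0$, $0 \le j < m$

Now choose $\xi(t, x)$ so that it satisfies the Hamilton-Jacobi equation

(8.5) $\xi_t + g(x, \xi_x) = 0$

$\xi(0, x) = s(x)$

Notice that $\xi(t, x) = 0$ is a regular hypersurface in $t, x$-space for
$t$ small. Then, at the noncharacteristic points of $S(\xi)$ where
$g(x, \xi_x) \ne 0$, (8.4) is equivalent to

(8.6) $\dfrac{\partial}{\partial t}\, u_j \circ \xi + \xi_t\, u_{j+1} \circ \xi = 0, \quad 0 \le j < m - 1$

$\dfrac{\partial}{\partial t}\, u_{m-1} \circ \xi + \sum_{j=1}^{m} a_j\!\left(x, \xi_x, \ldots, \dfrac{\partial}{\partial x}\right) (u_{m-j} \circ \xi) = v \circ \xi$

$u_j \circ \xi = w_j \circ \xi$ for $t = 0$, $0 \le j < m$

which, by Lemma 1, has a holomorphic solution (choose $m_j = j$).
Hence, by analytic continuation, we have a uniformization theorem.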




Uniformization Theorem. If

$g(x, \xi_x) \not\equiv 0$ on $S(\xi)$

and if $\xi$ solves (8.5), then Cauchy's problem (5.2) has a solution
$u(\xi, x)$ which, close to $S(\xi)$, is uniformized of order $< m$ by the
substitution $u \to u \circ \xi$. Its restriction to $\xi(t, x) = 0$ uniformizes $u(0, x)$
of order $< m$.

9. GEOMETRY OF THE UNIFORMIZATION 
The uniformizing mapping

(9.1) $t, x \to \xi, x = \xi(t, x), x$

sends $t = 0$ into $S(\xi)$ and its Jacobian is $\xi_t$. A function $u(\xi, x)$
which is uniformized by (9.1) is in general multivalued and ramified
around the image $K(\xi)$ in $\xi, x$-space of the variety $\xi_t = 0$. We
shall give a description of $K(\xi)$.

The Hamilton-Jacobi equation is closely connected with the
Hamiltonian system

(9.2) $dx = g_p(x, p)\, dt, \quad dp = -g_x(x, p)\, dt$

$x[0, y] = y, \quad p[0, y] = s_y(y)$

The change of variables $(x, t) \to (y, t)$ is holomorphic both ways for
small $t$ and reduces to the identity for $t = 0$. We shall express it by
writing

(9.3) $f(t, x) = f[t, y]$

$f(\ )$ and $f[\ ]$ denoting different functions.

Since $dg(x, p) = (g_x g_p - g_p g_x)\, dt = 0$, $g(x, p)$ is invariant on the
curve (band) (9.2) and since $p\, dx = m\, g(x, p)\, dt$, the band is
characteristic if $g(y, s_y) = 0$, that is, if it starts from a characteristic
contact element $y, s_y(y)$.

Furthermore, the form

$\omega = p\, dx - g\, dt$

is a relative invariant for (9.2). In fact, its differential depends
only on the initial contact element (Cartan [1]). In our case,
$\omega = ds(y)$ for $t = 0$, so that $\omega$ is exact. Hence, there exists a
function $\xi[t, y]$ such that

$d\xi[t, y] = \omega, \quad \xi[0, y] = s(y)$

Changing to the variables $t, x$, putting

$\xi(t, x) = \xi[t, y]$

we get

$\xi_x = p[t, y], \quad \xi_t = -g(x, p), \quad \xi_t + g(x, \xi_x) = 0$

$\xi(0, x) = s(x)$

so that $\xi$ is the unique solution of (8.5). Integrating for $y$ constant
we also get

$d\xi[t, y] = (m - 1)\, g(x, p)\, dt = (m - 1)\, g(y, s_y)\, dt$

so that, more explicitly,

$\xi(t, x) = s(y) + (m - 1)\, t\, g(y, s_y)$

If we now define a bicharacteristic curve as the projection of (9.2)
into $x$-space: $x = x[t, y]$ issuing from a characteristic contact
element of $S(\xi)$, we get the following lemma:

Lemma 4. $K(\xi)$ is the union of bicharacteristics issuing from the
characteristic part $T(\xi)$ of $S(\xi)$.

Consider the differential operator

$L = \dfrac{\partial}{\partial t} + a_1$

which, if

$\dfrac{d}{dt} = \dfrac{\partial}{\partial t} + g_p(x, \xi_x)\, \dfrac{\partial}{\partial x}$

denotes differentiation along the curve $x = x[t, y]$ (i.e., for $y$
constant), can be written as

$L = \dfrac{d}{dt} + h(t, x)$

Since $dg/dt = 0$, $L$ commutes with $g$. For future use we note the
following:





Lemma 5. We have

(9.4) $g\,(u_m \circ \xi - w_m \circ \xi) = (v - aw) \circ \xi + t\, \mathrm{hol}(t, x)$

(9.5) $L\, g\, u_m \circ \xi = g\, \mathrm{hol}(t, x)$

Proof. By (8.2)

$g\, u_m \circ \xi = \dfrac{\partial}{\partial t}\, u_{m-1} \circ \xi = \mathrm{hol}(t, x)$

and, applying Lemma 3 to $a(u - w) = v - aw$, we have

$g\,(u_m - w_m) \circ \xi + \sum_{j=1}^{m} a_j\, (u_{m-j} - w_{m-j}) \circ \xi = (v - aw) \circ \xi$

Since the $a_1, a_2, \ldots$ do not contain $\partial/\partial t$, (8.4) shows that the sum
on the left vanishes with $t$. Hence (9.4) follows. Applying
Lemma 3 to $a u_1 = v_1$, we have (8.4) with $m$ replaced by $m + 1$.
Hence, since $g$ commutes with $L$, we have

$L\, g\, u_m \circ \xi + g \sum_{j=2}^{m} a_j\, u_{m+1-j} \circ \xi = g\, v_1 \circ \xi$

so that (9.5) follows.

10. THE HIGHER DERIVATIVES 
Since

$u_{j+1} \circ \xi = \dfrac{1}{g}\, \dfrac{\partial}{\partial t}\, u_j \circ \xi, \quad j \ge 0$

$u_j \circ \xi = \mathrm{hol}(t, x), \quad 0 \le j < m$

we have

$u_{m+j-1} \circ \xi = f_0 + \dfrac{f_1}{g} + \cdots + \dfrac{f_j}{g^j}, \quad j \ge 0$

where $f_0, \ldots, f_j$ are holomorphic in $t, x$, but not uniquely
determined. Clearly $f_j$ is unique on $g = 0$, a choice of $f_j$ determines
$f_{j-1}$ on $g = 0$ and so on. It is possible to give explicit formulas for
these functions and we shall do this for $j = 1$. Let us write

$F(t, x) \equiv 0 \bmod G(t, x)$


when

$F(t, x) = G(t, x)\, \mathrm{hol}(t, x)$

Theorem 1. Let $U(t, x)$ be the solution of

$\left[\dfrac{d}{dt} + h(t, x)\right] U(t, x) = 0$

$U(0, x) = (v - aw) \circ \xi(0, x)$

Then,

(10.1) $u_m \circ \xi \equiv w_m \circ \xi + \dfrac{U(t, x)}{g(x, \xi_x)} \bmod t$

Remark. Let $G(t, x)$ be the solution of

$\left[\dfrac{d}{dt} + h(t, x)\right] G(t, x) = 0$

$G(0, x) = 1$

along the characteristic band (9.2). Then

$U(t, x) = G(t, x)\, [v(\xi, y) - a(y, \partial/\partial y)\, w(\xi, y)]_{\xi = s(y)}$

Proof. Put

$V(t, x) = g\,(u_m \circ \xi - w_m \circ \xi) - U(t, x)$

By Lemma 5,

$LV \equiv 0 \bmod g$

$V \equiv 0 \bmod t$

so that, passing to the variables $t, y$

$\left(\dfrac{d}{dt} + h[t, y]\right) V[t, y] = g(y, s_y)\, \mathrm{hol}[t, y]$

$V[0, y] = 0$

It follows from this that

$V[t, y] = t\, g(y, s_y)\, \mathrm{hol}[t, y]$

that is,

$V \equiv 0 \bmod tg$

which is equivalent to (10.1). The proof of the remark is obvious.






REFERENCES FOR LECTURE I 

1. P. C. Rosenbloom, “Linear partial differential equations,” Surveys Appl.
Math. 6 (1958) 43-195.

2. L. Gårding, “Some trends and problems in the theory of linear partial
differential operators,” Proc. Int. Congress of Math., Edinburgh (1958).

3. L. Hörmander, Partial differential operators, Academic Press, New York,
Berlin (1963).

4. J. Peetre, “Another approach to elliptic boundary problems,” Comm. Pure
Appl. Math. 14 (1961) 711-731.

5. J. B. Lopatinski, “On a method of reducing boundary problems for a system
of elliptic type to regular equations,” Ukrain. Mat. Zur. 6 (1953) 123-151.

6. F. Noether, “Über eine Klasse singulärer Integralgleichungen,” Math.
Ann. 82 (1921) 42-63.

7. M. F. Atiyah and I. M. Singer, “The index of elliptic operators on compact
manifolds,” to be published.

8. R. Bott, “The index theorem for homogeneous differential operators,” to
be published.

9. J. Peetre, “Mixed problems for higher order elliptic equations in two
variables I,” Annali della Scuola Normale Superiore di Pisa, Serie III, Vol. XV,
Fasc. IV (1961) 337-353.

10. S. L. Sobolev, “Some applications of functional analysis to mathematical
physics,” Izdat. Leningrad. Gos. Univ. (1950).

11. S. Campanato, “Il teorema di immersione di Sobolev per una classe di
aperti non dotati della proprietà di cono,” Ricerche Mat. 11 (1962) 103-122.

12. V. A. Kondratev, “On the solvability of the first boundary value problem
for elliptic equations,” Dokl. Akad. Nauk SSSR 36 (1960) 771-774.

13. R. Courant and P. Lax, “The propagation of discontinuities in wave
motion,” Proc. Nat. Acad. Sci. 42 (1956) 872-876.

14. B. Riemann, Ueber die Fortpflanzung ebener Luftwellen von endlicher
Schwingungsweite. Collected Works, Dover Publications.

15. P. Lax, “Asymptotic solutions of oscillatory initial value problems,” Duke
Math. J. 24.4 (1957) 627-646.

16. G. Ludwig, “Exact and asymptotic solutions of the Cauchy problem,”
Pure and Appl. Math. 13.3 (1960) 473-508.

17. V. Thomée, “Existence proofs for mixed problems in the theory of linear
hyperbolic equations in two variables by means of the continuity method,”
Math. Scand. 6 (1958) 5-32.

18. C. F. D. Duff, “Mixed problems for hyperbolic equations of general order,”
Canad. J. Math. 9 (1959) 195-221.

19. T. Carleman, “Propriétés asymptotiques des fonctions fondamentales des
membranes vibrantes,” C. R. du huitième Congrès des mathématiciens
scandinaves, Stockholm (1934).

20. P. D. Lax, C. S. Morawetz, and R. S. Phillips, “The exponential decay of
solutions of the wave equation in the exterior of a star-shaped obstacle,”
Bull. Amer. Math. Soc. 68.6 (1962) 593-595.

21. G. Frobenius, “Über das Pfaffsche Problem,” J. Math. 82 (1877) 230-315.

22. P. Méthée, “Sur les distributions invariantes dans le groupe des rotations
de Lorentz,” Comm. Math. Helv. 28 (1954) 224-261.

23. L. Gårding, “Distributions invariantes,” Séminaire sur les équations aux
dérivées partielles, Collège de France, November 1961-May 1962
(mimeographed).


REFERENCES FOR LECTURE II 

1. E. Cartan, Les invariants intégraux, Paris (1932).

2. L. Gårding, T. Kotake, and J. Leray, “Uniformisation et développement
asymptotique de la solution du problème de Cauchy linéaire, à données
holomorphes (Problème de Cauchy I bis et VI).” To be published in Journal
de Mathématiques pures et appliquées.

3. L. Hörmander, Partial differential operators, Berlin (1963).

4. J. Leray, “Uniformisation de la solution du problème analytique de Cauchy,
près de la variété qui porte les données de Cauchy (Problème de Cauchy I),”
Bull. Soc. Math. France 6.86 (1957) 389-429.

5. P. C. Rosenbloom, The Majorant method, Proc. Sympos. Pure Math. IV
51-72. Am. Math. Soc., Providence, R.I. (1961).




5

Quasiconformal
Mappings and
Their Applications

Lars V. Ahlfors


1. INTRODUCTION 

It would be hopeless and confusing to try to trace the modern 
development in the whole wide field of analysis. For instance, the 
recent years have seen an enormous upswing in the theory of partial 
differential equations, a theory that heretofore had been limited to 
the comparatively mild-mannered equations that occur in physical
applications. On another front, the theory of analytic functions
of several variables and of manifolds with complex structures has
received a strong impetus from methods of topology and algebraic
geometry. Since the interactions between various branches of 
mathematics have a healthy tendency to grow stronger rather than 
weaker, such progress cannot be ignored even in a partial survey, 
but this chapter shall be restricted to its indirect repercussions. 
The core of classical analysis is the theory of functions of one com¬ 
plex variable with its direct traditions from Cauchy and Riemann. 
In some respects this is a very narrow area, but in contrast with 
other branches of analysis it is one that was already highly devel¬ 
oped at the turn of the century, and in which further progress is 
therefore both difficult and very significant. 



Let us first throw a quick glance at the status of classical function 
theory in the 1920's, say, a period that set the stage for this author's 
generation. The general theory of uniformization had just been 
completed, and thereby a formidable stumbling block had been 
removed. R. Nevanlinna had founded his monumental theory of 
meromorphic functions which by its very perfection seemed to 
block further progress in this direction. Many mathematicians 
were intrigued by the results achieved in the theory of univalent 
functions, and whether we believed in its profundity or dismissed 
it as a curiosity, the Bieberbach conjecture was and still remains an 
unmistakable challenge. 

In view of the difficulty and abundance of the problems, it seemed 
clear that new tools would be needed. Such tools were indeed 
forthcoming. This author admires the variational method of 
M. Schiffer, which seemed to bring the Bieberbach conjecture within 
easy reach. Another essential contribution was the Bergman 
kernel function which established the first connection with Hilbert 
space theory. The theory of extremal length, which had its roots in
the work of H. Grötzsch but rose from obscurity to brilliance thanks
to the vision of A. Beurling, should also be mentioned.

Equally important, and in their ultimate consequences perhaps 
overwhelmingly so, were the observations that poured forth from 
the uncanny brain of Teichmüller. For better or for worse, his
ghost has risen from an early grave to haunt a great deal of 
the thinking that has gone on in postwar function theory. An 
excellent way to acquaint the mathematical public with the 
trends in my special field is to relate some of the experiences of 
this author and fellow specialists in dealing with the heritage of 
Teichmüller.

It should be emphasized that this brief survey does not and 
cannot do justice to all that has been done in the field of one complex 
variable. To mention all would be to spread it out so thin that 
nobody would be much wiser. The intention is quite the contrary, 
namely to disseminate knowledge of a method whose very name 
might easily create the false impression of a spineless generalization 
of conformal mapping. Our aim will be achieved if we can convey 
that quasiconformal mappings are intimately and essentially, even 
surprisingly, connected with the most classical problems in the 
theory of analytic functions. 





2. QUASICONFORMAL AND CONFORMAL MAPPINGS 


Historically, the first mention of quasiconformal mappings (by a 
different name) is in a small paper in 1928 of H. Grötzsch. He
solves the extremal problem of mapping a square on a rectangle so 
that the vertices correspond, and so that the mapping is as nearly 
conformal as possible. The problem was treated primarily as a 
curiosity, but the paper also contains a more significant remark to the 
effect that Picard’s theorem is valid for quasiconformal mappings. 

Like all of Grötzsch’s work, this paper was buried in a small
journal and therefore very slow to gain recognition. This author
learned of it by word of mouth in 1931, but did not use it until 1936
in a paper that was primarily devoted to quite different questions.
Meanwhile, in 1935, Lavrentieff had published a paper on what he
called “fonctions presque-analytiques” and which were practically
the class considered by Grötzsch. It is not clear how the name
“quasiconformal mappings” originated, but by this time the notion
began to be generally accepted. In the beginning it emerged
primarily as a tool in the study of conformal mappings. At the same
time it was used, quite independently, by Morrey and the school of
Lavrentieff for the study of elliptic equations in two variables. In
a few years it became increasingly clear that quasiconformal
mappings merited a theory of their own, a view that gained its
strongest support from the more prophetic than rigorous writings
of Teichmüller in 1938-1940.

Now, to begin the mathematical part of this chapter, our first
concern is with the precise definition of a quasiconformal mapping.
If $f$ is a differentiable mapping, it is clear that infinitesimal circles
are mapped on infinitesimal ellipses. The ratio $a/b$ of the major
and minor axes of the ellipse is the dilatation at a point, and the
mapping is quasiconformal if the dilatation is bounded.
Analytically, it is convenient to write

$df = f_z\, dz + f_{\bar z}\, d\bar z$

It is sufficient to consider the case where $f$ is sense-preserving. Then
$|f_{\bar z}| < |f_z|$, and the axes of the ellipse are $(|f_z| + |f_{\bar z}|)\, |dz|$ and
$(|f_z| - |f_{\bar z}|)\, |dz|$. The condition of quasiconformality is thus

$\dfrac{|f_z| + |f_{\bar z}|}{|f_z| - |f_{\bar z}|} \le K < \infty$

and when $K$ is given we speak of a $K$-quasiconformal mapping.
The same condition can be expressed by $|f_{\bar z}| \le k\, |f_z|$ with $k < 1$.
It is obvious from the definition that a mapping and its inverse have
the same dilatation, and also that a conformal mapping does not
change the dilatation. Similarly, the product of two quasiconformal
mappings is again quasiconformal. In other words, the
basic composition properties of conformal mappings remain in force.
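The definition can be checked numerically on real-linear maps $w = az + b\bar z$, for which $f_z = a$ and $f_{\bar z} = b$ are constant. The following minimal sketch computes the dilatation from these two numbers and composes two such maps; the function names are illustrative assumptions, not notation from the text.

```python
def dilatation(fz: complex, fzbar: complex) -> float:
    """Axis ratio (|f_z| + |f_zbar|) / (|f_z| - |f_zbar|) of the image ellipse."""
    a, b = abs(fz), abs(fzbar)
    if b >= a:
        raise ValueError("the mapping is not sense-preserving")
    return (a + b) / (a - b)


def compose(f, g):
    """Coefficients (a, b) of f(g(z)) for f(w) = a1*w + b1*conj(w), g(z) = a2*z + b2*conj(z)."""
    a1, b1 = f
    a2, b2 = g
    return (a1 * a2 + b1 * b2.conjugate(), a1 * b2 + b1 * a2.conjugate())


# (x, y) -> (2x, y) is w = 1.5*z + 0.5*conj(z); (x, y) -> (x, 3y) is w = 2*z - conj(z).
stretch_x = (1.5 + 0j, 0.5 + 0j)
stretch_y = (2 + 0j, -1 + 0j)

K1 = dilatation(*stretch_x)                       # stretching by 2 along the x-axis
K2 = dilatation(*stretch_y)                       # stretching by 3 along the y-axis
K12 = dilatation(*compose(stretch_x, stretch_y))  # (x, y) -> (2x, 3y)
print(K1, K2, K12)
```

In this example the composite has dilatation 3/2, well below the product $K_1 K_2 = 6$, which is only an upper bound.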

It was recognized quite early that differentiability was too strong a 
condition, and in the beginning makeshift arrangements were intro¬ 
duced to allow for a more general behavior at least at some points. 
But the most important need was for compactness, which can be 
attained in two ways: geometrically and analytically.
Geometrically, a quadrilateral $Q$ is the conformal image of a rectangle,
and the ratio of the sides of the rectangle serves as the modulus
$m$ of $Q$. Suppose that the homeomorphism $f$ maps $Q$ on $Q'$ with
modulus $m'$. Then $f$ is said to be $K$-quasiconformal if $m'/m \le K$
for all $Q$. It is easy to see that this coincides with the original
definition when $f$ is differentiable, and that the resulting class of
$K$-quasiconformal mappings is compact.

Analytically, we define a $K$-quasiconformal mapping as one which
satisfies a partial differential equation

(2.1) $f_{\bar z} = \mu(z)\, f_z$

with $|\mu| \le k = (K - 1)/(K + 1)$. The coefficient $\mu$ is called the
complex dilatation. If it is continuous, we have the same class as
before, but the point is that $\mu$ need only be measurable and $f$ a weak
solution of the equation. More precisely, $f$ is to be a homeomorphism
whose distributional derivatives are locally square integrable
and satisfy (2.1).

It is a nontrivial and important fact that these definitions are 
equivalent. As a sad commentary on the communications between 
mathematicians, we might mention that the essence of the theorem 
is contained in Morrey’s paper of 1938, but that its relevance for 
quasiconformal mappings was not clearly realized until 1956 (by 
Bers). 

Equation (2.1) is classical, and is named after Beltrami who wrote 
it in a different form. Beltrami obtained it as an analogue of the 
Cauchy-Riemann equations on a surface in space. A little more 
generally, consider a Riemannian metric in the $z$-plane. Its
fundamental form can be written as

$ds^2 = \rho(z)^2\, |dz + \mu\, d\bar z|^2$

with $\rho(z) > 0$, $|\mu| < 1$. Clearly, a mapping $w = f(z)$ is conformal
from the Riemannian metric to the euclidean metric of the $w$-plane,
if and only if the ratio $df/(dz + \mu\, d\bar z)$ is independent of the direction
of $dz$, and this is so exactly when (2.1) is fulfilled. Allowing for a
sufficiently general concept of a Riemannian metric, we recognize
the fundamental fact that a quasiconformal mapping is nothing more
than a conformal mapping with respect to a metric determined by the
complex dilatation.
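The direction-independence is easy to test numerically at a single point. In the sketch below the values of $f_z$ and $\mu$ at the point are chosen arbitrarily (an assumption for illustration), $f_{\bar z}$ is set to $\mu f_z$ as in (2.1), and the stretching $|df|$ is compared with $|dz + \mu\, d\bar z|$ over several directions of $dz$.

```python
import cmath


def metric_ratios(fz: complex, mu: complex, n: int = 8):
    """|df| / |dz + mu*conj(dz)| for n unit directions dz, assuming f_zbar = mu * f_z."""
    fzbar = mu * fz
    ratios = []
    for j in range(n):
        dz = cmath.exp(2j * cmath.pi * j / n)      # unit vector in direction j
        df = fz * dz + fzbar * dz.conjugate()      # df = f_z dz + f_zbar d(zbar)
        ratios.append(abs(df) / abs(dz + mu * dz.conjugate()))
    return ratios


def euclidean_ratios(fz: complex, mu: complex, n: int = 8):
    """|df| / |dz| for the same directions; this varies with direction unless mu = 0."""
    fzbar = mu * fz
    out = []
    for j in range(n):
        dz = cmath.exp(2j * cmath.pi * j / n)
        out.append(abs(fz * dz + fzbar * dz.conjugate()))
    return out


print(metric_ratios(2 + 1j, 0.3 - 0.4j))     # constant, equal to |f_z|
print(euclidean_ratios(2 + 1j, 0.3 - 0.4j))  # varies with the direction
```

With respect to the metric $|dz + \mu\, d\bar z|$ the stretching is the same in all directions, while in the euclidean metric it is not.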

This observation, by itself, would be quite sufficient to give quasi¬ 
conformal mappings a respectable status, but we shall see that the 
tie-in with analytic functions is much more direct. 

3. EXTREMAL PROBLEMS 

It has already been mentioned that quasiconformal mappings 
were originally used primarily as a tool in the study of conformal 
mappings. Many problems in function theory, especially if con¬ 
nected with geometric questions, require the use of cleverly selected 
auxiliary functions or mappings. If we insist that the mappings 
be conformal, the construction may be very difficult, because con¬ 
formal mappings are by nature rather rigid. In many cases quasi¬ 
conformal mappings will serve the same purpose, and they are much 
easier to find. For instance, many problems concerned with the 
parabolic or hyperbolic type of Riemann surfaces were solved by 
this technique. We shall not dwell on this aspect, because it is 
relatively uninteresting. Quasiconformal mappings have found 
their place in the toolkit of the well-equipped analyst, and that is 
that. 

Much more important is the connection with extremal problems. 
Mathematicians have always been fascinated by maxima and 
minima, both for their own sake and because of the special circum¬ 
stances that occur at extreme points. Nowhere are extremal prob¬ 
lems as fruitful as in the theory of analytic functions. Analytic 
proofs depend extensively on estimates, and to find the best possible 
estimate is not only a source of esthetic satisfaction, but will almost 
invariably lead to a better understanding of the situation being 
studied. 




As has been pointed out, the very genesis of quasiconformal 
mappings was connected with the elementary extremal problem 
formulated by Grötzsch. Teichmüller was the first to extract a
general principle: In a class of mappings it is required to find one 
whose maximal dilatation is a minimum. It is to be expected that 
the solution is unique, and that the extremal mapping is character¬ 
ized by simple properties. 

Let us illustrate by a few examples. First, as a very slight
generalization of Grötzsch’s problem we may look for the extremal
mapping of one quadrilateral on another. Because dilatation is
invariant under conformal mapping it is natural to replace the
quadrilaterals by conformally equivalent rectangles. Also, it is to
be expected, and quite easy to prove, that the extremal mapping is
represented by the affine mapping of one rectangle on the other.
Let us now replace the quadrilaterals by pentagons. Teichmüller
proves that they can be mapped conformally, on polygons, as shown
in Figure 1, such that $a:b = a':b'$. The extremal mapping will be
represented by a stretching in the direction of the $y$-axis. This is
already a sophisticated result, mainly because it is not known
beforehand which side will contain the concave angle.
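For the rectangle case the extremal dilatation can be written down directly. The sketch below assumes rectangles with sides parallel to the axes, given by their side lengths, with corresponding sides matched by the affine map $(x, y) \to (x\, a'/a,\; y\, b'/b)$; the function name is illustrative.

```python
def affine_dilatation(a: float, b: float, a2: float, b2: float) -> float:
    """Dilatation of the affine map taking an a-by-b rectangle onto an a2-by-b2 rectangle."""
    sx, sy = a2 / a, b2 / b  # stretch factors along the two sides
    return max(sx, sy) / min(sx, sy)


# Square onto a 2-by-1 rectangle (the situation of Grötzsch's problem).
print(affine_dilatation(1.0, 1.0, 2.0, 1.0))
# The value depends only on the moduli m = a/b and m' = a2/b2.
print(affine_dilatation(3.0, 2.0, 6.0, 1.0))
```

The result equals the larger of $m'/m$ and $m/m'$, in agreement with the geometric definition of $K$-quasiconformality by moduli of quadrilaterals.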

The problem is more complicated for hexagons, but can still be 
solved by the same method. More general problems call for the 
extremal mapping of a multiply connected region on another, or 
even of a closed Riemann surface on another of the same genus. 
Efforts to solve these problems had failed, but in a flash of genius 
Teichmüller came forth with a daring conjecture which he himself
admitted was based on almost nothing. In the few cases that could
be solved explicitly he observed that the solution was always given
by a conformal mapping, followed by an affine mapping, and by
another conformal mapping. Teichmüller conjectured that the
same would be true in all problems. For topological reasons,
however, the conformal mappings cannot be single-valued mappings,
except in the simplest cases, and by heuristic reasoning (we could
almost say by sleight of hand), he concluded that they must be
determined, locally, by quadratic differentials with certain special
properties. For two closed Riemann surfaces $S_1$ and $S_2$ the
conjecture would be as follows: there exists a unique pair of analytic
quadratic differentials $\Phi_1(z_1)\, dz_1^2$ and $\Phi_2(z_2)\, dz_2^2$, and a positive
constant $k < 1$, such that the extremal mapping can be expressed
in local coordinates by

$\sqrt{\Phi_2}\, dz_2 = \sqrt{\Phi_1}\, dz_1 + k\, \overline{\sqrt{\Phi_1}\, dz_1}$

Figure 1

After hard work Teichmüller finally managed to prove his
jecture, but it took several more years to produce a reasonably neat 
proof. Even now, the available proofs leave much to be desired in 
way of directness and simplicity. 

The most remarkable feature of this result is the direct connection 
it establishes between quasiconformal mappings and quadratic 
differentials. This connection is entirely unexpected and is a sure 
sign that the role of quasiconformal mappings in function theory is 
far from superficial. Furthermore, any classically trained analyst 
cannot dismiss as pure coincidence that the number of linearly inde¬ 
pendent quadratic differentials is precisely equal to the number of 
parameters, or moduli, which according to Riemann characterize a 
closed Riemann surface of given genus. Teichmüller does not fail
to make the most of this observation. 

We now mention a slight generalization of Teichmüller’s problem.
If in the problem of polygons the vertices become everywhere
dense, we are faced with the problem of a homeomorphic map of a
disk on another with a given boundary correspondence. It is
known under what conditions there exists a quasiconformal mapping
with the given boundary correspondence, and Teichmüller’s
extremal problem makes sense. For a long time it was thought that 
this problem must have a unique solution, and that failure to prove 
it was only a sign of lacking technique. Quite recently, however, 
K. Strebel made a significant contribution to this problem, and, 
among other things, produced a beautifully simple counterexample. 
To describe this example, we remark that the disk can obviously be 



Figure 2


replaced by any simply connected region. We choose to consider a 
region consisting of the lower half plane and a vertical half strip 
(Figure 2). Map this region onto itself by a stretching in the 
vertical direction. This produces a boundary correspondence which 
happens to be the identity on the horizontal part of the boundary. 
It is not difficult to show that the stretching is extremal for this cor¬ 
respondence. On the other hand we can find any number of 
mappings with the same boundary correspondence and the same 
maximal dilatation, for instance by use of the identity mapping in 
the lower half plane and the stretching in the “chimney.” To 
obtain uniqueness it would thus be necessary to introduce some 
additional condition, and nobody knows what the proper condition 
is. 


4. THE GENERALIZED RIEMANN MAPPING THEOREM 

We return to our earlier remark that a solution of the Beltrami
equation $f_{\bar z} = \mu f_z$ is a conformal mapping with respect to a
Riemannian metric. If $\mu$ is very regular, for instance analytic in the real
sense, and defined in the whole plane, it is classical that the metric
defines a structure of Riemann surface, and the general
uniformization theorem asserts the existence of a conformal mapping on the
whole plane or on the unit disk. As it happens, the
quasiconformality rules out the latter case from the start.

The result is true without any regularity conditions whatsoever,
except of course for the condition $|\mu| \le k < 1$ of quasiconformality,
and it is this theorem that goes under the name of the generalized
Riemann mapping theorem. The proof is implicit in the work of
Morrey in 1938, but today we have a simpler proof based on the





theory of singular integrals of Calderón and Zygmund. In this
form, the proof is due to Boyarskii and Vekua.

We sketch the proof under the simplifying assumption that $\mu$ has
compact support (this is a harmless restriction that is easy to
remove). The mapping function is then analytic at $\infty$, and it may
be normalized so that $f(z) - z \to 0$ for $z \to \infty$. The familiar
representation

(4.1) $f(\zeta) = -\dfrac{1}{\pi} \iint \dfrac{f_{\bar z}}{z - \zeta}\, dx\, dy + \zeta$

is valid as soon as $f_{\bar z} \in L^p$ for some $p > 2$. Differentiation gives

$f_\zeta = -\dfrac{1}{\pi} \iint \dfrac{f_{\bar z}}{(z - \zeta)^2}\, dx\, dy + 1$

where the integral has to be interpreted as a Cauchy principal
value. According to Calderón and Zygmund the integral transform,
which we denote by $T$, may be regarded as a bounded operator in $L^p$
with a norm $C_p$ which tends to 1 for $p \to 2$. The Beltrami equation
takes the form

$\mu f_z = \mu T(\mu f_z) + \mu$

The norm of $\mu T$ is $\le k C_p$, and therefore $< 1$ for a suitable $p > 2$.
We find that the Neumann series

$f_{\bar z} = \mu f_z = \mu + \mu T \mu + \mu T \mu T \mu + \cdots$

converges in $L^p$, and substitution in (4.1) yields the desired solution.

It is perhaps interesting to note that we have not used the
uniformization theorem. In fact, it is even possible to prove the
uniformization by this method. With relatively simple proofs of
the Calderón-Zygmund inequality available, this is a reasonable
approach.

The result can be sharpened. Namely, if $\mu$ depends differentiably
or analytically on a parameter, then the solution $f$, normalized so
as to be unique, will also depend differentiably or analytically on
the parameter.

Finally, instead of mapping the whole plane on itself we can map
the unit disk on itself with a given complex dilatation $\mu$. The
easiest way to arrive at this result is to extend $\mu$ to the whole plane
by means of an obvious reflexion process.





5. THE PROBLEM OF MODULI 

The most important application of quasiconformal mappings is 
to the problem of moduli, for instance for compact Riemann surfaces 
of fixed genus $g$. Let us first describe Teichmüller’s own approach,
in particular for the most elementary case of g = 1 . 

All toruses are to be considered as points in a space, and the
distance should depend on the maximal dilatation of the extremal
quasiconformal mapping of one torus on another. The triangle
inequality is satisfied if the distance is taken to be the logarithm of
the least maximal dilatation. Consider the familiar representation
of the torus as a period parallelogram. First, the shape of the torus
depends only on the complex ratio $\tau = \omega_2/\omega_1$ of the periods.
Second, it depends on the period net rather than on the particular
choice of a period parallelogram. In other words, the torus does
not change when $\tau$ is subjected to a modular transformation, a
complication which it is preferable to avoid. Accordingly, we shall
consider two period parallelograms as the same torus only if they
are similar. With this convention the space of toruses is
represented by the upper half plane, and we find that the Teichmüller
distance is nothing but the noneuclidean distance between the
corresponding points, a thoroughly satisfactory result.
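As a small numerical companion, the sketch below computes the noneuclidean distance between two period ratios and the corresponding least maximal dilatation. It assumes the noneuclidean metric is normalized as $|d\tau| / \operatorname{Im} \tau$, the normalization under which the distance equals the logarithm of the least maximal dilatation; the function names are illustrative.

```python
import math


def noneuclidean_distance(tau1: complex, tau2: complex) -> float:
    """Distance in the upper half plane for the metric |d tau| / Im tau (assumed normalization)."""
    if tau1.imag <= 0 or tau2.imag <= 0:
        raise ValueError("period ratios must lie in the upper half plane")
    return math.acosh(1 + abs(tau1 - tau2) ** 2 / (2 * tau1.imag * tau2.imag))


def least_maximal_dilatation(tau1: complex, tau2: complex) -> float:
    """K of the extremal mapping between the marked toruses, with log K equal to the distance."""
    return math.exp(noneuclidean_distance(tau1, tau2))


# Square torus (tau = i) against the rectangular torus with tau = 2i:
# the extremal map is the affine stretching by the factor 2.
print(least_maximal_dilatation(1j, 2j))
```

For two rectangular toruses $\tau = ia$ and $\tau = ib$ this reduces to the ratio of the larger to the smaller of $a$ and $b$, the dilatation of the affine stretching.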

For $g > 1$ Teichmüller proposes a similar procedure. With
deliberate oversimplification here, Teichmüller’s approach is as
follows. A point is not only a compact Riemann surface of genus $g$,
but it is a surface together with a choice of generators of the
fundamental group. A topological mapping automatically induces a
correspondence of the generators, and the smallest maximum
dilatation is to be determined in the class of mappings that induce the
same correspondence. This defines the Teichmüller distance, and
the metric space obtained in this manner is the Teichmüller space
$T_g$. He proves that it is topologically equivalent to $R^{6g-6}$. As a
topological theorem this is a rediscovery of a result of Fricke, or so
it would seem, for Fricke is almost undecipherable for modern
mathematicians.

6. THE COMPLEX STRUCTURE

It is not for lack of admiration that this author has given such a
brief account of Teichmüller's theory of moduli. The reason is
that his method has been superseded by simpler and more efficient
procedures. Curiously enough, it was by abandoning the idea of 
extremal quasiconformal mappings that further progress became 
possible. Interesting and important as it is, the extremal problem 
seems to lead very rapidly to serious technical difficulties, and the 
overcoming of these difficulties is not essential for the problem of 
moduli. 

The metrical structure obtained by Teichmüller is closer to a
Finsler than to a Riemann metric. Actually, the classical problem
calls for a complex structure, and Teichmüller was well aware that
his method did not yet solve this problem. In fact, he refutes his
own method and says explicitly that it is no good for the purpose of
establishing a complex structure. This prediction turned out to be
wrong.

In a final effort Teichmüller produced a solution of the structure
problem, by an entirely different method, but it was so cumbersome
that it is doubtful whether anybody else has checked all the details.
Many years later E. Rauch and this author found, independently of
each other, that the original method, used primarily as a convenient
parametrization, does lead fairly easily to the existence of a natural
complex structure on the space $T_g$. The most significant
development, however, is due to L. Bers. He found, as has been
indicated, that Teichmüller's extremal problem can be dispensed with
altogether, and that the most direct approach is via a closer study
of the Beltrami equation. In other words, it is the full theory of
quasiconformal mappings, rather than a special problem, which is the
right clue to the moduli of Riemann surfaces.

It is only fair to mention, at this point, that the algebraists have 
also solved the problem of moduli, in some sense even more com¬ 
pletely than the analysts. Because of the different language, it is at 
present difficult to compare the algebraic and analytic methods, but 
it would seem that both have their own advantages. We shall 
treat only the analytic problem. 

Let $W_0$ be a closed Riemann surface of genus g > 1. The
universal covering surface of $W_0$ can be mapped conformally on the
upper half plane H with the cover transformations corresponding to
a discontinuous group $\Gamma^0$ of Möbius transformations, a Fuchsian
group. Clearly, $W_0$ can be identified with the quotient space
$H/\Gamma^0$. If W is another surface of the same genus, the
corresponding group $\Gamma$ is isomorphic to $\Gamma^0$, and two such
W's are conformally equivalent if and only if the groups $\Gamma$ are
conjugate subgroups of the full group of noneuclidean motions. The
conformally different Riemann surfaces are thus identified with
conjugacy classes. A point in Teichmüller space, however, will be
represented by a group $\Gamma$ together with an isomorphism $\theta$
of $\Gamma^0$ on $\Gamma$, again with the understanding that two
isomorphisms are regarded as equivalent if they differ by an inner
automorphism. More precisely, the space we obtain is the Teichmüller
space $T(W_0)$ with $W_0$ as a reference point, but it is perfectly
obvious how to change the reference point.

To study the structure of $T(W_0)$ we need an analytic
representation. For this purpose we consider the linear space
$B(\Gamma^0)$ of Beltrami differentials: its elements are measurable
bounded functions $\mu$, defined on H, such that the expression
$\mu\, d\bar z/dz$ is invariant under $\Gamma^0$. Assume, in
particular, that $\|\mu\|_\infty < 1$. By the generalized Riemann
mapping theorem there exists a solution $f$ of the Beltrami equation
$f_{\bar z} = \mu f_z$ which maps H homeomorphically onto itself. If
$A \in \Gamma^0$ it follows at once that $f(Az)$ is another solution,
and hence that $f(Az) = A^\mu[f(z)]$ where $A^\mu$ is conformal, and
consequently a Möbius transformation. The mapping
$\theta^\mu : A \to A^\mu$ is an isomorphism which determines a group
and a point of $T(W_0)$. Because any two closed Riemann surfaces of
the same genus are quasiconformally equivalent, it is easy to see
that all points of $T(W_0)$ can be obtained in this way. Different
$\mu$ may determine the same point, however, and it becomes essential
to investigate when this is so.
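The Beltrami equation itself is easy to test on the simplest example. The sketch below assumes a constant real coefficient $\mu$ and ignores the invariance under $\Gamma^0$ altogether. The affine map $f(z) = (z + \mu \bar z)/(1 + \mu)$ then maps H onto itself, fixes 0, 1, $\infty$, and should satisfy $f_{\bar z} = \mu f_z$. The helper `wirtinger` and the numerical values are illustrative assumptions, not part of the lecture.

```python
def wirtinger(func, z: complex, h: float = 1e-6):
    """Return (f_z, f_zbar) at z, using central differences in x and y."""
    fx = (func(z + h) - func(z - h)) / (2 * h)
    fy = (func(z + 1j * h) - func(z - 1j * h)) / (2 * h)
    return (fx - 1j * fy) / 2, (fx + 1j * fy) / 2

mu = 0.3  # constant real Beltrami coefficient with |mu| < 1 (illustrative value)

def f(z: complex) -> complex:
    """Affine solution of the Beltrami equation, normalized to fix 0, 1 and infinity."""
    return (z + mu * z.conjugate()) / (1 + mu)

z0 = 0.5 + 0.25j
fz, fzbar = wirtinger(f, z0)
residual = abs(fzbar - mu * fz)          # should vanish up to rounding
K = (1 + abs(mu)) / (1 - abs(mu))        # maximal dilatation of f
print(residual, K)
```

For this affine map the central differences are exact up to rounding error, so the residual is a direct check of the equation rather than an approximation.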

Let the mapping function associated with $\mu$ be normalized so that
0, 1, $\infty$ remain fixed, and let it be denoted by $f^\mu$. It is
easily seen that $\mu_1$ and $\mu_2$ determine the same Teichmüller
point if and only if $f^{\mu_1} = f^{\mu_2}$ on the real axis. From a
global point of view this condition is rather unmanageable, but the
corresponding infinitesimal condition is quite simple. Because the
reference point $W_0$ is arbitrary, it is sufficient to study the
behavior of $f^\mu$ for small $\mu$. For any $\nu \in B(\Gamma^0)$ we
introduce the Fréchet derivative

$$f[\nu] = \lim_{t \to 0} \frac{f^{t\nu} - z}{t}$$

which is known to exist. Clearly, $\nu$ produces an infinitesimally
vanishing change if and only if $f[\nu] = 0$ on the real axis. This
is a linear condition on $\nu$, and a simple computation which we
shall not reproduce shows that it can be expressed by

$$\iint_{H/\Gamma^0} \nu\varphi \, dx\, dy = 0 \tag{6.1}$$

where $\varphi$ runs through all quadratic differentials, that is, all
analytic functions such that $\varphi\, dz^2$ is invariant under
$\Gamma^0$. Let $N(\Gamma^0)$ denote the subspace of $B(\Gamma^0)$
for which (6.1) is satisfied. Because there are 3g − 3 linearly
independent quadratic differentials, the quotient space
$B(\Gamma^0)/N(\Gamma^0)$ has 3g − 3 complex dimensions. If
$\nu_1, \dots, \nu_{3g-3}$ belong to linearly independent cosets, a
simple application of the implicit function theorem shows that the
Teichmüller points near the origin can be uniquely represented by
$\mu = \tau_1\nu_1 + \cdots + \tau_{3g-3}\nu_{3g-3}$ where
$\tau_1, \dots, \tau_{3g-3}$ are complex parameters. A somewhat
deeper study shows that these parameters define a complex
structure on $T_g$. This neat derivation of the complex structure is
due to L. Bers.

To complete the picture, let us mention that there is a natural
Riemannian metric on $T_g$. In fact, the tangents are in one-to-one
correspondence with the quadratic differentials, and it suffices to
introduce a Hermitian inner product on the space of quadratic
differentials. Such a product was defined, years ago, by H. Petersson,
in the theory of automorphic forms. Surprisingly enough,
quite a few properties of the Riemannian metric can be proved by
explicit calculations. For instance, it is Kählerian, and the analytic
sectional curvatures are negative. An important open problem is
the connection between the Riemannian structure and the Teichmüller
metric.

7. THE GLOBAL APPROACH 

There is a more direct method which permits us to introduce the 
complex structure in a single step, without passing to infinitesimal 
derivations. 

Given a Beltrami differential $\mu$ we constructed $f^\mu$ as a
mapping of the upper half plane on itself with complex dilatation
$\mu$. To obtain the mapping, $\mu$ is extended to the lower half
plane by reflection, and this defines $f^\mu$ by the generalized
Riemann mapping theorem as a mapping of the whole plane upon itself.
Nothing prevents us, however, from using another extension of $\mu$,
for instance to set $\mu = 0$ in the lower half plane. The
corresponding mapping function $f_\mu$ is analytic in the lower half
plane; thus we obtain a quasiconformal mapping of H on a Jordan
region $\Omega$ together with a conformal mapping of the lower half
plane $H^*$ on its complement $\Omega^*$. Moreover, if we set
$f_\mu(Az) = A_\mu[f_\mu(z)]$, then $A_\mu$ is still a linear
transformation, and we obtain a new group $\Gamma_\mu$, isomorphic to
$\Gamma^0$, which is discontinuous in the invariant regions $\Omega$
and $\Omega^*$. Such groups are called quasi-Fuchsian.

Because $f_\mu$ is determined only up to a linear transformation, it
is natural to pass to its Schwarzian derivative, usually denoted by
$\{f_\mu, z\}$, defined in the lower half plane. The interesting
observation is that this turns out to be a quadratic differential
$\varphi_\mu$ with respect to $\Gamma^0$. Moreover, $\varphi_\mu$
completely determines the values of $f_\mu$ on the real axis, up to a
linear transformation, and thereby the group $\Gamma_\mu$. It
follows that $T(W_0)$ can be represented as a subset of
$Q(\Gamma^0)$, the space of quadratic differentials, and hence as a
set in $C^{3g-3}$. This set is open and bounded. The boundedness is
an easy consequence of a theorem of Z. Nehari, but the openness is
not so elementary, and at present it cannot be proved directly except
by use of infinitesimal deformations. The proof shows that the
complex structure induced by the imbedding is the same as the one we
have already introduced.
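The reason the Schwarzian derivative is a quadratic differential can be made explicit in a few lines. The following worked step is a sketch using the standard definition and composition rules of the Schwarzian, which are not spelled out in the lecture; $B$ denotes an arbitrary linear (Möbius) transformation.

```latex
% Definition of the Schwarzian derivative
\{f, z\} = \left(\frac{f''}{f'}\right)' - \frac{1}{2}\left(\frac{f''}{f'}\right)^{2}

% Composition rules for a Moebius transformation B
\{B \circ f,\, z\} = \{f, z\}, \qquad
\{f \circ B,\, z\} = \{f, Bz\}\, B'(z)^{2}

% Apply both rules to f_\mu(Az) = A_\mu[f_\mu(z)] with A in \Gamma^0:
\varphi_\mu(Az)\, A'(z)^{2}
  = \{f_\mu \circ A,\, z\}
  = \{A_\mu \circ f_\mu,\, z\}
  = \{f_\mu, z\}
  = \varphi_\mu(z)
```

The last line says precisely that $\varphi_\mu\, dz^2$ is invariant under $\Gamma^0$, and the first composition rule shows that the indeterminacy of $f_\mu$ up to a linear transformation does not affect $\varphi_\mu$.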

As a bounded region $T_g$ has a Bergman kernel function and a
Riemannian metric deduced from the kernel function. It is not
known how the Bergman metric is related to the Petersson metric.
The nature of the boundary and its role for the problem of moduli is
also an open question.

Although we have not been very explicit, it should be clear from
what has been said that $T_g$ admits a group $M_g$ of isometries with
the property that Teichmüller points which correspond to each other
under such an isometry represent conformally equivalent Riemann
surfaces. We call $M_g$ the modular group, because it is the exact
analogue of the modular group in the half plane. The quotient
space $T_g/M_g$ is the space of Riemann surfaces of genus g. It is a
complex space, but not a manifold, for it has singularities at the
fixpoints of elements of the group $M_g$. The fixpoints occur at
points which correspond to surfaces with conformal self-mappings, and
E. Rauch has determined all cases where the singularities are
nonremovable.



6 

Differential Topology 

J. Milnor 


The basic objects studied in differential topology are smooth mani¬ 
folds, sometimes with boundary, and smooth mappings between 
such manifolds. Here the word “smooth” is used to mean
“differentiable of class $C^\infty$.” To give a rough idea of the flavor of this
field, let us list a few of its central problems. 

The Diffeomorphism Problem. Given two smooth manifolds M and
M', how can we decide whether or not there is a diffeomorphism
from M to M' (i.e., a smooth homeomorphism with smooth inverse)?

The Cobordism Problem. Given a smooth compact manifold M 
without boundary, does there exist a smooth compact manifold W 
whose boundary is equal to M? We may refine this problem by 
putting extra structures on both M and W. For example, we can 
require an orientation, or a weakly complex structure, or we may 
require that M and W should be k-connected (compare [10]).

The Imbedding Problem. Given M and M', does there exist a
smooth imbedding $M \to M'$? If so, can we classify all such
imbeddings? For example, the problem of classifying imbeddings
of the circle in 3-space forms the field of “knot theory.”



1. CHARACTERIZATIONS OF THE n-SPHERE

Let us single out one particular case of the diffeomorphism prob¬ 
lem for consideration, namely the problem of characterizing the 
n-sphere. The many different tools which can be brought to bear 
on this one question will provide a survey of much of the field of 
differential topology. 

First let us ask what conditions on a smooth manifold guarantee
that it is homeomorphic to $S^n$.

The first such characterization is due to Reeb [16]. Let M be a
smooth n-dimensional manifold. By a Morse function f on M will
be meant a smooth real valued function whose critical points are
all nondegenerate. Thus in a neighborhood of each critical point
we can choose local coordinates $u_1, \dots, u_n$ so that
$f = \text{constant} \pm u_1^2 \pm \cdots \pm u_n^2$. (Morse [12, Lemma 4] or [11, §2.2].)

Theorem 1. If M is compact, without boundary, and possesses a
Morse function with only two critical points, then M is homeomorphic
to $S^n$.

Outline of proof. (See Figure 1.) Let $m_0$, $m_1$ denote the
minimum and maximum of f. Thus in a neighborhood of its minimum
point we have

$$f = m_0 + u_1^2 + \cdots + u_n^2$$

It follows that the set of points x where $f(x) \le m_0 + \varepsilon$
is diffeomorphic to the disk $D^n$. Similarly, the set of x with
$f(x) \ge m_1 - \varepsilon$ is diffeomorphic to $D^n$. But, deforming
M into itself along the gradient lines of the function f [i.e., along
the orthogonal trajectories of the surfaces $f^{-1}(\text{constant})$]
we can slide the disk, $f \le m_0 + \varepsilon$, up so that it covers
precisely the disk $f \le m_1 - \varepsilon$. (Compare [7] or [11,
§§3, 4].) Thus M can be considered as the union of two imbedded
disks which intersect only along their common boundary. But this
implies that M is homeomorphic to $S^n$. To see this, consider first
the following.

Figure 1

Lemma 1. Any homeomorphism $h : S^{n-1} \to S^{n-1}$ extends to a
homeomorphism $H : D^n \to D^n$.

Proof. Set $H(tu) = t\,h(u)$ for $0 \le t \le 1$, where u denotes an
arbitrary unit vector.
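The radial extension in this proof is concrete enough to compute. The following is a small sketch for n = 2, where the particular circle homeomorphism `h` is an arbitrary illustrative choice and not taken from the text.

```python
import math

def h(u):
    """A homeomorphism of the unit circle (illustrative choice).

    The angle map theta -> theta + 0.5*sin(theta) is strictly increasing
    and commutes with adding 2*pi, so it induces a bijection of the circle.
    """
    theta = math.atan2(u[1], u[0])
    phi = theta + 0.5 * math.sin(theta)
    return (math.cos(phi), math.sin(phi))

def H(x):
    """Radial extension H(t*u) = t*h(u) of h to the closed unit disk."""
    t = math.hypot(x[0], x[1])
    if t == 0.0:
        return (0.0, 0.0)          # the center is sent to the center
    hu = h((x[0] / t, x[1] / t))   # apply h to the unit vector u = x/|x|
    return (t * hu[0], t * hu[1])

u = (math.cos(1.0), math.sin(1.0))   # a boundary point
x = (0.3, -0.4)                      # an interior point with |x| = 0.5
print(H(u), h(u))
print(math.hypot(*H(x)))
```

By construction H preserves the distance to the origin and restricts to h on the boundary, which is all that the lemma requires for a homeomorphism.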

Now if M is the union of two topologically imbedded disks
$g_0(D^n)$, $g_1(D^n)$ which intersect precisely along their common
boundary, we can first choose any homeomorphism from
$g_0(S^{n-1}) = g_1(S^{n-1})$ to the equator of $S^n$, and then
extend, using Lemma 1, to a homeomorphism which maps $g_0(D^n)$ to
the southern hemisphere and $g_1(D^n)$ to the northern hemisphere.
This completes the proof.

Remark. Note that the differentiable structure is destroyed in
the course of this proof. The reason is that there is no
differentiable analogue of Lemma 1. (Even if
$h : S^{n-1} \to S^{n-1}$ is a diffeomorphism, the extension we have
constructed is highly nondifferentiable at the origin, unless h
happens to be orthogonal.) We will return to this point later, in
Section 3.

Following is another partial characterization of $S^n$ (compare
[17, 8]).

Theorem 2. Let M be a compact smooth manifold which is the
union of two open sets, each diffeomorphic to a euclidean space.
Then M is homeomorphic to a sphere.

In fact the proof will show that M with a single point deleted is
diffeomorphic to the euclidean space $R^n$. The proof is based on two
lemmas, both of which are interesting in their own right.

Lemma 2 (Palais and Cerf). Let $f_1$ and $f_2$ be smooth, orientation
preserving imbeddings of the disk $D^n$ into the interior of a
connected manifold $M^n$. Then there exists a diffeomorphism h of
$M^n$ onto itself so that $h \circ f_1 = f_2$.





For the proof, which is quite elementary, the reader is referred to 
[3] or [14]. 

Lemma 3 (Brown, Stallings). Let M be a paracompact manifold
such that every compact subset is contained in an open set diffeomorphic
to euclidean space. Then M itself is diffeomorphic to euclidean space.

Proof. (Compare [2, 21].) It is not difficult to show that M is
a monotone union of disks. That is, we can find submanifolds with
boundary

$$W_1 \subset W_2 \subset W_3 \subset \cdots \subset M$$

with union M so that each $W_i$ is diffeomorphic to $D^n$, and so that
each $W_i$ is contained in the interior of $W_{i+1}$. We wish to
compare this sequence with the sequence

$$D_1^n \subset D_2^n \subset D_3^n \subset \cdots \subset R^n$$

where $D_i^n$ denotes the disk of radius i in euclidean space. Start
with any diffeomorphism $f_1 : D_1^n \to W_1$. Using Lemma 2, this can
be extended to a diffeomorphism $f_2 : D_2^n \to W_2$, and so on. (To
see this, consider the following diagram

$$\begin{array}{ccc}
W_i & \subset & W_{i+1} \\
\uparrow f_i & & \uparrow g \\
D_i^n & \subset & D_{i+1}^n
\end{array}$$

where g is an arbitrary orientation preserving diffeomorphism.
Choosing a diffeomorphism h as in Lemma 2, we can now set $f_{i+1}$
equal to $h \circ g$.) Finally, piecing together all these
diffeomorphisms $f_i$, we obtain the required diffeomorphism
$R^n \to M$.

Using these lemmas we can prove a sharpened form of Theorem 1. 
Let M be compact, without boundary, and let $f : M \to R$ be a
smooth function with only two critical points.

Theorem 1'. Even if these critical points are allowed to be
degenerate, it still follows that M is homeomorphic to $S^n$.

Proof. (See Figure 2.) Let p and q be the critical points and
let U be a neighborhood of p which is diffeomorphic to $R^n$, with
$q \notin U$. By deforming M into itself along the gradient lines of
f, we can stretch U so that it covers any compact subset of $M - q$.
Hence it follows from Lemma 3 that $M - q$ is diffeomorphic to $R^n$,
which clearly completes the proof.

Figure 2

The proof of Theorem 2 is similar. Let M be covered by open
sets U and V which are diffeomorphic to $R^n$ (Figure 3). Given
$q \in M - U \subset V$ we will show that any compact subset of
$M - q$ can be covered by an open set diffeomorphic to U.

Since V is diffeomorphic to $R^n$, it is easy to construct a
diffeomorphism $h : V \to V$ which (1) satisfies $h(q) = q$, (2)
shrinks the compact set $M - U$ down into an arbitrarily small
neighborhood of q, and (3) coincides with the identity outside of a
larger compact set. It follows from (3) that h extends to a
diffeomorphism H of M. Clearly any compact subset of $M - q$ can be
covered by H(U) for suitable choice of the diffeomorphism H.
Therefore it follows from Lemma 3 that $M - q$ is diffeomorphic to
$R^n$. This completes the proof of Theorem 2.

One of the most striking properties of the sphere S n is that the 
complement of each point is contractible. 

Problem 1. Given a smooth manifold M such that $M - p$ is
contractible, does it follow that M is homeomorphic to a sphere?

It follows without too much difficulty from this hypothesis that 
M is compact and has the homotopy type of a sphere. Conversely, 
if M is compact, without boundary, and has the homotopy type of a 
sphere, it can be shown that M — p is contractible. 

If the dimension n of M is 0, 1, or 2, then M must actually be
diffeomorphic to $S^n$. Here is a proof for n = 2. Choose a
Riemannian metric and an orientation for M. By a classical theorem (the
existence of “isothermal coordinates”) each small neighborhood in 
M can be mapped conformally and diffeomorphically onto a region 
in the plane. Thus M becomes a Riemann surface. Since M is 
simply connected, the classical “uniformization theorem” asserts 
that M is conformally diffeomorphic to either the complex plane, 
the open unit disk, or the Gauss 2-sphere. But only the last possi¬ 
bility satisfies our hypothesis. 

If the dimension n is 3, we have the classical Poincaré problem.
So far, all attempts to solve this problem have foundered. For 
n = 4 the problem is also unsolved. 

For $n \ge 5$ the problem has been solved affirmatively by Stallings
and Zeeman [20, 24] and by Smale. In particular we have the
following.

Theorem of Smale [19]. If M is a smooth homotopy n-sphere of
dimension $n \ge 5$, then M admits a Morse function with only two
critical points.

Hence Theorem 1 implies that M is homeomorphic to the 
n-sphere. 

The proof of this theorem is much more difficult than anything we
have encountered so far. The basic idea can be outlined very
roughly as follows. It is not difficult to construct a Morse function
$f : M \to R$ with many critical points. Furthermore f can be chosen
so that the critical points occur in the proper order, in the following
sense. If

$$f(u_1, \dots, u_n) = \text{constant} - u_1^2 - \cdots - u_\lambda^2 + u_{\lambda+1}^2 + \cdots + u_n^2$$

in terms of local coordinates $u_1, \dots, u_n$ near a critical point
p, then the integer $\lambda = \lambda(p)$ is called the index of this
critical point. Now f can be chosen so that $f(p) = \lambda(p)$ for
each critical point p. Thus the minimum points, with $\lambda = 0$,
occur in $f^{-1}(0)$, the maximum points occur in $f^{-1}(n)$, and the
remaining critical points come in between.

Figure 4 (panels: Before, After)

The difficult part of the proof now consists in showing that a
critical point of index $\lambda$ and a critical point of index
$\lambda + 1$ can sometimes be mutually cancelled. Thus in Figure 4
the critical point p of index 1 and the critical point q of index 2
can be mutually eliminated, by suitably changing the function.
Repeating this procedure over and over, we eventually eliminate all
critical points which are not essential in order to give the manifold
M its proper homology groups. But a homotopy sphere has homology only
in dimensions 0 and n. Hence we are left with only two critical
points, with indices 0 and n respectively.
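The bookkeeping behind this cancellation procedure can be illustrated as follows. This is only a counting sketch: it assumes the standard fact that the alternating sum of the numbers of critical points of each index equals the Euler characteristic, and it ignores the geometric hypotheses under which a given pair can actually be cancelled. The starting counts are a made-up example for n = 5.

```python
def euler_characteristic(counts):
    """Alternating sum of the numbers of critical points, indexed by Morse index."""
    return sum((-1) ** lam * c for lam, c in enumerate(counts))

def cancel(counts, lam):
    """Remove one critical point of index lam together with one of index lam + 1."""
    if counts[lam] == 0 or counts[lam + 1] == 0:
        raise ValueError("no pair of indices lam, lam + 1 available to cancel")
    new = list(counts)
    new[lam] -= 1
    new[lam + 1] -= 1
    return new

# Hypothetical Morse function on a homotopy 5-sphere: counts[lam] critical points of index lam.
counts = [1, 2, 3, 2, 1, 1]
history = [counts]
for lam in (1, 1, 2, 3):
    counts = cancel(counts, lam)
    history.append(counts)

for step in history:
    print(step, euler_characteristic(step))
```

Each cancellation removes one term of each sign, so the alternating sum is unchanged, and the process stops at one critical point of index 0 and one of index n.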










2. SOME EXOTIC SPHERES 

This section will construct an example of a manifold which is 
homeomorphic, but not diffeomorphic, to a sphere. The proof will 
be based on the Hirzebruch signature theorem. 

Let V be a compact oriented manifold of dimension 4k. The
signature (or index*) $\sigma(V)$ is defined as follows, following
Hermann Weyl [22, p. 41]. Let $\alpha, \beta \in H_{2k}(V; Z)$ be
homology classes with integer coefficients. Then the intersection
number $\alpha \cdot \beta$ is a well-defined integer. This
intersection number is symmetric in $\alpha$ and $\beta$ since the
dimension 2k is even. Passing to real coefficients, we can choose a
basis $\alpha_1, \dots, \alpha_r$ for the vector space
$H_{2k}(V; R)$ so that the matrix of intersection numbers
$(\alpha_i \cdot \alpha_j)$ is diagonal. Now the number of positive
diagonal entries minus the number of negative diagonal entries is
called the signature $\sigma$ of V.

The following fundamental observation is due to Thom. Suppose
that V is the boundary of a compact oriented manifold $W^{4k+1}$. Then
the signature $\sigma(V)$ is zero. Thom’s proof is based on the
Poincaré duality theorem.

We will also need to make use of the Pontrjagin classes of a smooth 
manifold M. Without attempting a definition, here are some basic 
properties. 

1. To each smooth manifold M there are associated cohomology
classes $p_i \in H^{4i}(M; Z)$ for i = 1, 2, 3, . . . .

2. If U is an open subset of M, then $p_i(U)$ is equal to $p_i(M)$
restricted to U.

3. If M is parallelizable, then $p_i(M) = 0$.

Now let V be a closed oriented manifold of dimension 4k. If
$i_1 + i_2 + \cdots + i_r = k$, then the cohomology class
$p_{i_1} p_{i_2} \cdots p_{i_r} \in H^{4k}(V; Z)$ gives rise to an
integer which is denoted by $p_{i_1} \cdots p_{i_r}[V]$.
These integers are called the Pontrjagin numbers of V.

If V is a boundary then these Pontrjagin numbers are all zero.
Conversely, we have the following: 

Thom Cobordism Theorem. If the Pontrjagin numbers
$p_{i_1} p_{i_2} \cdots p_{i_r}[V]$ are all zero, then the m-fold
disjoint sum $V + V + \cdots + V$ is the boundary of a compact
oriented manifold, for some m > 0.

* The term “index” is preferred by Hirzebruch and others. I have substituted 
signature to avoid confusion with the Morse index, as used in Section 1. 





An important consequence is the following. 

Hirzebruch Signature Theorem. The signature of any closed, oriented
4k-manifold can be expressed as a linear combination of its Pontrjagin
numbers, where the coefficients are rational numbers which depend only
on the dimension. In particular:

$$\begin{aligned}
\sigma(V^4) &= \frac{p_1[V^4]}{3} \\
\sigma(V^8) &= \frac{7\,p_2[V^8] - p_1^2[V^8]}{45} \\
\sigma(V^{12}) &= \frac{62\,p_3[V^{12}] + \cdots}{945} \\
\sigma(V^{16}) &= \frac{127\,p_4[V^{16}] + \cdots}{4725} \\
\sigma(V^{20}) &= \frac{146\,p_5[V^{20}] + \cdots}{13365}
\end{aligned}$$

where the dots indicate terms in $p_1, \dots, p_{k-1}$.

Proof. Let V and V' be two manifolds with the same Pontrjagin
numbers. Form $V - V'$ (the disjoint sum in which the orientation
of V' has been reversed). Then all Pontrjagin numbers of $V - V'$
are zero. Hence some multiple

$$(V - V') + \cdots + (V - V')$$

is a boundary. This implies that the signature

$$\sigma((V - V') + \cdots + (V - V')) = m\sigma(V) - m\sigma(V')$$

is zero, and hence that $\sigma(V) = \sigma(V')$.

Thus $\sigma(V)$ is a function of the Pontrjagin numbers of V. Since
both signature and Pontrjagin numbers are integers which behave
additively when we form disjoint sums, it is clear that this function
must be linear with rational coefficients.

The explicit formulas for k = 1, 2, 3, 4, 5 can be obtained by
computing the signature and Pontrjagin numbers for suitable 
examples, and then solving the resulting linear equations. For 
further details the reader is referred to Hirzebruch [4]. 
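For k = 2 the computation just described is short enough to carry out. The sketch below assumes the standard values for two 8-manifolds, which are not given in the lecture: the complex projective space $CP^4$ has signature 1, $p_2 = 10$ and $p_1^2 = 25$, and the product $CP^2 \times CP^2$ has signature 1, $p_2 = 9$ and $p_1^2 = 18$. We solve $\sigma = a\,p_2 + b\,p_1^2$ for the unknown rational coefficients a and b.

```python
from fractions import Fraction

# (signature, p_2[V], p_1^2[V]) for two closed oriented 8-manifolds (assumed standard values).
examples = {
    "CP^4": (1, 10, 25),
    "CP^2 x CP^2": (1, 9, 18),
}

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 exactly by Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    x = Fraction(b1 * a22 - a12 * b2) / det
    y = Fraction(a11 * b2 - b1 * a21) / det
    return x, y

(s1, p2_1, p11_1), (s2, p2_2, p11_2) = examples.values()
a, b = solve_2x2(p2_1, p11_1, s1, p2_2, p11_2, s2)
print(a, b)   # coefficients of p_2 and p_1^2 in the signature formula
```

The result agrees with the formula $\sigma(V^8) = (7 p_2 - p_1^2)/45$ of the theorem.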

For our purposes we will only need the following. A manifold
will be called almost parallelizable if it becomes parallelizable when a
single point is removed.

Corollary. The signature of a closed, almost parallelizable mani¬ 
fold of dimension 8 is always divisible by 7. Similarly, for a closed, 
almost parallelizable manifold of dimension 12, 16 or 20, the signa¬ 
ture is divisible by 62, 127, or 146 respectively. 

Proof. We have $p_i(V^{4k} - x) = 0$, hence $p_i(V^{4k}) = 0$ for
i < k. Thus the signature theorem reduces to

$$\sigma(V^8) = \frac{7}{45}\, p_2[V^8], \qquad \sigma(V^{12}) = \frac{62}{945}\, p_3[V^{12}],$$

and so on.

In higher dimensions we obtain analogous results. (However,
sharper results in this direction can be obtained by a different
method: see [5].)


Lemma 4. For $k \ge 2$ there exists a compact parallelizable
4k-dimensional manifold W with signature +8, such that the boundary of
W is a homotopy sphere.


Proof. We will construct W in such a way that the matrix of
intersection numbers is as follows:

$$\begin{pmatrix}
2 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 2 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 2 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 2 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 2 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 2 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 2
\end{pmatrix}$$

This remarkable matrix was suggested to the author by Hirzebruch.
Note that it is positive definite, with determinant +1, and has only
even entries on the diagonal.
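The stated properties of this matrix can be verified by exact arithmetic. The following sketch computes the leading principal minors by fraction-exact elimination. By Sylvester's criterion (a standard fact, not quoted in the text) the matrix is positive definite when all these minors are positive, and the last minor is the determinant. The function name is illustrative, and the elimination assumes that no zero pivot is met.

```python
from fractions import Fraction

E8 = [
    [2, 1, 0, 0, 0, 0, 0, 0],
    [1, 2, 1, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 0, 0, 0, 0],
    [0, 0, 1, 2, 1, 0, 0, 0],
    [0, 0, 0, 1, 2, 1, 0, 1],
    [0, 0, 0, 0, 1, 2, 1, 0],
    [0, 0, 0, 0, 0, 1, 2, 0],
    [0, 0, 0, 0, 1, 0, 0, 2],
]

def leading_principal_minors(matrix):
    """Leading principal minors via exact Gaussian elimination without pivoting."""
    n = len(matrix)
    a = [[Fraction(x) for x in row] for row in matrix]
    minors, det = [], Fraction(1)
    for k in range(n):
        pivot = a[k][k]
        det *= pivot                 # minor_k = minor_{k-1} * pivot_k
        minors.append(det)
        if pivot == 0:
            break                    # elimination without pivoting cannot continue
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return minors

minors = leading_principal_minors(E8)
symmetric = all(E8[i][j] == E8[j][i] for i in range(8) for j in range(8))
even_diagonal = all(E8[i][i] % 2 == 0 for i in range(8))
positive_definite = len(minors) == 8 and all(m > 0 for m in minors)
print(minors, symmetric, even_diagonal, positive_definite)
```

Since the form is positive definite of rank 8, its signature is 8, which is the value used for W below.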

As basic building block for the manifold W we take a tubular
neighborhood T of the diagonal in $S^{2k} \times S^{2k}$. The homology
group $H_{2k}(T; Z)$ is infinite cyclic, and the intersection number
of a generator with itself is +2. [Proof: if
$\alpha, \beta \in H_{2k}(S^{2k} \times S^{2k})$ are the obvious
generators, then
$(\alpha + \beta) \cdot (\alpha + \beta) = 2\alpha \cdot \beta = 2$.]

Figure 5

Proof that T is parallelizable. Note that $S^{2k} \times S^{2k}$ can
be imbedded in $R^{4k+1}$ (as the boundary of a neighborhood of
$S^{2k}$). Hence its tangent bundle is induced from that of the unit
sphere by means of the Gauss map
$g : S^{2k} \times S^{2k} \to S^{4k}$. Since $g|T$ is homotopic to a
constant, it follows that T is parallelizable.

Next we introduce the operation of “plumbing together” two
copies $T_1$ and $T_2$ of T. By this we mean the operation of matching
together a region in $T_1$ and a region in $T_2$ in such a way that the
central 2k-sphere of $T_1$ will have intersection number +1 with the
central 2k-sphere of $T_2$. We must then “round off” the corners, so
as to obtain a smooth manifold with boundary. For the 2-dimensional
case this operation is illustrated in Figure 5.

In practice we need not two but eight copies of T. These are to
be plumbed together according to the following schema:

    1 - 2 - 3 - 4 - 5 - 6 - 7
                    |
                    8

That is, $T_2$ is to be plumbed together with $T_1$ and $T_3$; $T_5$
is to be plumbed together with $T_4$, $T_6$, and $T_8$; and so on. In
this way we obtain a smooth manifold W which clearly has the following
properties:

1. The manifold W has the homotopy type of a bouquet consisting
of eight copies of $S^{2k}$ which intersect at a single point.

2. If $\alpha_1, \dots, \alpha_8 \in H_{2k}(W; Z)$ is the
corresponding homology basis, then the matrix
$(\alpha_i \cdot \alpha_j)$ of intersection numbers is as indicated
previously, positive definite with determinant +1.

Thus it follows that the signature of W is +8. 

3. The boundary $\partial W$ is a homology sphere.

For the Poincaré duality theorem implies that

$$H_i(W, \partial W; Z) = \begin{cases}
Z \oplus \cdots \oplus Z \ (\text{eight summands}) & \text{for } i = 2k \\
Z & \text{for } i = 4k \\
0 & \text{otherwise}
\end{cases}$$

Furthermore the homomorphism

$$H_{2k}(W; Z) \to H_{2k}(W, \partial W; Z)$$

corresponds to our intersection matrix, and hence is an isomorphism.
Now the homology exact sequence of the pair $(W, \partial W)$ shows that
$H_*(\partial W; Z) \cong H_*(S^{4k-1}; Z)$.

4. $\partial W$ is simply connected.

Proof. Any circle in $\partial W$ bounds a disk in W. By a general
position argument, since 2k + 2 < 4k, this disk can be pushed off of
the central 2k-spheres of the tubes $T_i$, and hence can be pushed into
the boundary of W.

It now follows, by standard arguments in homotopy theory, that
$\partial W$ is a homotopy sphere. Since W is clearly parallelizable,
this completes the proof of Lemma 4.

Theorem 3. The manifold $\partial W$ is homeomorphic to the sphere
$S^{4k-1}$ but is not diffeomorphic to $S^{4k-1}$.

Proof. It follows immediately from Smale or Stallings that the
homotopy sphere $\partial W$ is homeomorphic to $S^{4k-1}$. (Recall
that the dimension 4k − 1 is $\ge 7$.)

We will only prove the second statement for the special cases
k = 2, 3, 4, 5. Suppose that $\partial W$ were diffeomorphic to
$S^{4k-1}$. Then by pasting a 4k-disk onto the boundary of W we would
obtain a smooth closed manifold $V = W \cup D^{4k}$. Clearly V is
almost parallelizable. Thus, according to the corollary given earlier,
the signature of V must be a multiple of 7 or of 62 or 127 or 146
respectively. But $\sigma(V) = \sigma(W) = 8$. This contradiction
completes the proof for $2 \le k \le 5$. For higher values of k the
reader is referred to [6].
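The arithmetic behind this contradiction can be spelled out. For an almost parallelizable manifold the signature is $c \cdot p_k$, with $c$ the leading coefficient of the signature formula and $p_k$ an integer. Writing $c = a/b$ in lowest terms, the integrality of $p_k$ forces $a$ to divide the signature. The sketch below uses the coefficients quoted in the signature theorem and checks that 8 fails this test in each of the four dimensions. The variable names are illustrative.

```python
from fractions import Fraction

# Leading coefficients of the signature formula in dimensions 8, 12, 16, 20.
leading_coefficient = {
    8: Fraction(7, 45),
    12: Fraction(62, 945),
    16: Fraction(127, 4725),
    20: Fraction(146, 13365),
}
sigma_W = 8  # signature of the plumbed manifold W

contradiction = {}
for dim, c in leading_coefficient.items():
    # Fraction stores c in lowest terms, so c.numerator must divide the signature.
    forced_divisor = c.numerator
    contradiction[dim] = (sigma_W % forced_divisor != 0)

print(contradiction)
```

In every case the required divisor fails to divide 8, so no closed almost parallelizable manifold of that dimension can have signature 8.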

3. THE GROUP $\Gamma_n$

Since we have seen that exotic spheres exist, it is natural to try to 
classify them. For this purpose we introduce an abelian group $\Gamma_n$
which can be described in two different ways. 

First Description. Let $\mathrm{Diff}^+(D^n)$ be the group of all
orientation preserving diffeomorphisms from a closed n-disk onto
itself, and let $\mathrm{Diff}^+(S^{n-1})$ be the corresponding group
of diffeomorphisms of its boundary. Consider the restriction
homomorphism

$$r : \mathrm{Diff}^+(D^n) \to \mathrm{Diff}^+(S^{n-1})$$

We assert that the image of this homomorphism is a normal subgroup.
Hence the quotient group

$$\Gamma_n = \frac{\mathrm{Diff}^+(S^{n-1})}{r\,\mathrm{Diff}^+(D^n)}$$

is defined.

Second Description. A closed oriented manifold M will be called a
twisted sphere if it admits a Morse function with two critical points.
Let $\Gamma'_n$ denote the set of all oriented-diffeomorphism classes
of twisted n-spheres. To make this into a group we introduce the
connected sum operation.

Given connected oriented n-manifolds $M_1$, $M_2$, choose imbeddings

$$f_1 : D^n \to M_1, \qquad f_2 : D^n \to M_2$$

To make the orientations come out right it is important that one of
these two imbeddings should preserve orientation and the other one
should reverse orientation. The connected sum $M_1 \# M_2$ is now
formed from

$$[M_1 - f_1(\mathrm{Int}\, D^n)] \cup [M_2 - f_2(\mathrm{Int}\, D^n)]$$

by pasting together the two boundaries under the diffeomorphism
$f_2 \circ f_1^{-1}$.

Better still, in order to make the differentiable structure clear,
extend $f_1$ and $f_2$ to imbeddings $F_i : R^n \to M_i$. Then
$M_1 \# M_2$ can be formed from

$$[M_1 - F_1(0)] \cup [M_2 - F_2(0)]$$

by identifying each $F_1(x)$ with $F_2(x/\|x\|^2)$. (Note that this
correspondence preserves orientation.) Applying Lemma 2 to $F_i$
restricted to some large disk, we can prove that $M_1 \# M_2$ is well
defined up to orientation preserving diffeomorphism. (Compare
[8, 6].)

To compare the group $\Gamma_n$ with $\Gamma'_n$ we will construct a
homomorphism

$$\mathrm{Diff}^+(S^{n-1}) \to \Gamma'_n$$

Given $h \in \mathrm{Diff}^+(S^{n-1})$ let M(h) be the twisted sphere
which is obtained from two copies of $R^n$ by identifying each
$x \in R^n - 0$ in the first with $y = h(x/\|x\|)/\|x\|$ in the
second. Then

$$f = \frac{\|x\|^2}{1 + \|x\|^2} = \frac{1}{1 + \|y\|^2}$$

is a Morse function with only two critical points. Taking the
orientation of M(h) from the first copy of $R^n$ we obtain a
well-defined twisted sphere.
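That the two expressions for f agree on the overlap, and that f has exactly two critical points, follows from a short computation. The steps below are a sketch filling in the verification, using only that h maps unit vectors to unit vectors.

```latex
% On the overlap, y = h(x/\|x\|)/\|x\| and \|h(u)\| = 1, so
\|y\| = \frac{1}{\|x\|}
\quad\Longrightarrow\quad
\frac{1}{1 + \|y\|^{2}}
  = \frac{1}{1 + \|x\|^{-2}}
  = \frac{\|x\|^{2}}{1 + \|x\|^{2}} .

% In the first chart f is a strictly increasing function of \|x\|^2,
% so its only critical point there is x = 0, a nondegenerate minimum:
f = \|x\|^{2} - \|x\|^{4} + \cdots \qquad (x \to 0).

% In the second chart the only critical point is y = 0, a nondegenerate maximum:
f = 1 - \|y\|^{2} + \|y\|^{4} - \cdots \qquad (y \to 0).
```

Every point of M(h) lies in one of the two charts, so these are the only critical points, with values 0 and 1.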

It is easily verified that this construction defines a homomorphism
from $\mathrm{Diff}^+(S^{n-1})$ onto $\Gamma'_n$. (Compare [7, p. 402].)
Let us look at the kernel. Suppose that there is an orientation
preserving diffeomorphism $g$ from $M(h_1)$ to $M(h_2)$. According to
Lemma 2 we may assume that $g$ carries the point of $M(h_1)$ with
coordinate $x$ to the point of $M(h_2)$ with the same coordinate $x$
for all $\|x\| \le 1$. In terms of the $y$ coordinates, this means that
$g$ carries the point with coordinate $y = t h_1(u)$ to the point with
coordinate $y = t h_2(u)$ for all $t \ge 1$ and all $u \in S^{n-1}$.
Thus we have a diffeomorphism $R^n \to R^n$ which takes $S^{n-1}$ into
itself by the diffeomorphism $h_2 \circ h_1^{-1}$. Hence
$h_2 \circ h_1^{-1}$ belongs to the image

$$r\,\mathrm{Diff}^+(D^n) \subset \mathrm{Diff}^+(S^{n-1}).$$

Conversely it can be shown that any element $h|S^{n-1}$ of the image,
where $h \in \mathrm{Diff}^+(D^n)$, gives rise to a manifold
$M(h|S^{n-1})$ which is diffeomorphic to $S^n$. (The proof can be
based on Munkres [13, §6.1].) Henceforth we will drop the prime, and
identify $\Gamma_n$ with $\Gamma'_n$.






The main properties of these groups can be described as follows. 

Theorem 4. The group $\Gamma_n$ is finite abelian for all $n$. For
$n \le 6$ we have $\Gamma_n = 0$, but for $n = 7$ the group $\Gamma_7$
is non-zero.

In fact it follows from Theorem 3 that $\Gamma_{4k-1} \ne 0$ for all
$k \ge 2$.

The fact that $\Gamma_n = 0$ for $n \le 3$ is due to Munkres. For
$n = 4$ this assertion has recently been announced by J. Cerf. In
these cases the proof is based on the first definition, in terms of
diffeomorphisms of $S^{n-1}$. For $n \ge 5$ the proof is based rather
on the second definition, in terms of twisted spheres (see [6]). Some
indication of the method, for $n \ge 5$, is given in the following
section.

4. HOW TO RECOGNIZE AN HONEST SPHERE 

Let M be a twisted n-sphere. How can we decide whether or not 
M is diffeomorphic to the standard n-sphere? 

First choose an imbedding $M \subset R^{n+k}$ for some large value of
$k$. This is possible by a well-known theorem of Whitney [23].

Lemma 5. The normal bundle of $M$ is trivial. That is, there exist $k$
continuous linearly independent normal vector fields.

Outline of proof. Since $M - x$ is contractible, it is certainly
possible to choose such vector fields in the complement of $x$. Now
the “obstruction” to extending over $x$ is described by an element of
the homotopy group $\pi_{n-1}SO(k)$. These groups (for $k$ large) have
been computed by Bott [1].

Case 1. If $n \equiv 3, 5, 6,$ or $7 \pmod 8$, then
$\pi_{n-1}SO(k) = 0$, hence there is no obstruction.

Case 2. If $n = 4i$ then the group $\pi_{n-1}SO(k)$ is infinite cyclic.
In this case the obstruction class can be identified with a multiple of
the Pontrjagin number $p_i[M]$. (See [5].) But this number is zero
by the signature theorem.

Case 3. If $n \equiv 1$ or $2 \pmod 8$, then $\pi_{n-1}SO(k) = Z_2$.
The proof in this case is more delicate: the obstruction class $o$
satisfies $J(o) = 0$ according to [5], but J. F. Adams has shown that
the $J$ homomorphism has kernel zero for these values of $n$. This
completes the proof.
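
The three cases can be read off from a single table. The following
summary is a sketch added for orientation; the values are the standard
stable ones from Bott periodicity for $k$ large, and the row labelled
“case” simply refers back to the case numbers in the outline above.

```latex
% Stable values of \pi_{n-1} SO(k), k large, listed by n mod 8.
% Valid for n >= 2; for n = 1 the group \pi_0 SO(k) is trivial,
% since SO(k) is connected.
\[
\begin{array}{c|cccccccc}
  n \bmod 8      & 1   & 2   & 3 & 4 & 5 & 6 & 7 & 0 \\ \hline
  \pi_{n-1}SO(k) & Z_2 & Z_2 & 0 & Z & 0 & 0 & 0 & Z \\
  \mbox{case}    & 3   & 3   & 1 & 2 & 1 & 1 & 1 & 2
\end{array}
\]
% The columns n = 4 and n = 0 (mod 8) together are the case n = 4i.
```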

Next we must introduce the concept of framed cobordism. By a
framing $\varphi$ of an imbedded manifold $M^n \subset R^{n+k}$, $k$
large, will be meant a set of $k$ linearly independent normal vector
fields. Two framed manifolds $(M_0, \varphi_0)$ and $(M_1, \varphi_1)$
in $R^{n+k}$ are called framed cobordant if there exists a compact
framed manifold $(W, \psi)$ of dimension $n + 1$ in
$R^{n+k} \times [0, 1]$ with
$\partial W = M_0 \times 0 \cup M_1 \times 1$, where the framing $\psi$
restricts to $\varphi_0$ at one end and to $\varphi_1$ at the other.
Using the disjoint sum as composition operation, the framed cobordism
classes of $n$-manifolds form a group, which will be denoted by
$\Pi_n$.

There are two fundamental theorems concerning these groups. 

Theorem of Pontrjagin [15]. The framed cobordism group $\Pi_n$ is
canonically isomorphic to the stable homotopy group $\pi_{n+k}(S^k)$,
$n < k - 1$.

Theorem of Serre [18]. The stable homotopy groups $\pi_{n+k}(S^k)$,
$0 < n < k - 1$, are all finite abelian groups.

Thus $\Pi_n$ is finite.

Consider the class of all framed twisted $n$-spheres in $R^{n+k}$.
Using a suitably defined connected sum operation, and defining an
appropriate concept of “isomorphism,” we see that the set of all
isomorphism classes of framed twisted $n$-spheres forms an abelian
group, which will be denoted by $\mathrm{Fr}_n$. There is an exact
sequence

$$\pi_n SO(k) \to \mathrm{Fr}_n \to \Gamma_n \to 0$$

Proof. Every twisted sphere can be framed. The kernel of the
homomorphism $\mathrm{Fr}_n \to \Gamma_n$ is obtained by looking at
standard spheres with exotic framings. But a framing of the standard
$S^n \subset R^{n+k}$ is clearly described by an element of
$\pi_n SO(k)$.

On the other hand, there is clearly a homomorphism

$$j : \mathrm{Fr}_n \to \Pi_n \cong \pi_{n+k}(S^k)$$

Kervaire has shown that this homomorphism $j$ is onto, except
possibly when $n \equiv 2 \pmod 4$. Thus every framed cobordism class
contains a twisted sphere, except in dimensions 2, 6, 10, . . . .

Consider the kernel of this homomorphism $j$. Suppose, for
example, that $n \equiv 3 \pmod 4$. Then we claim that kernel $(j)$ is
infinite cyclic. Let $(M, \varphi)$ be a framed twisted $n$-sphere
belonging to the kernel of $j$. Then
$(M, \varphi) = \partial(W, \psi)$, where
$W \subset R^{n+k} \times [0, 1)$ is a compact framed manifold of
dimension $n + 1 = 4i$. Hence the signature $\sigma(W)$ is defined.

This integer $\sigma(W)$ is an invariant of the framed manifold
$(M, \varphi)$. For if $(M, \varphi)$ is also the boundary of
$(W', \psi')$, then by placing $W$ in $R^{n+k} \times [0, 1)$ and $W'$
in $R^{n+k} \times (-1, 0]$, we can construct a closed framed manifold

$$W \cup W' \subset R^{n+k} \times R$$

Giving $W \cup W'$ the orientation which is compatible with that of
$W$, we see that

$$\sigma(W \cup W') = \sigma(W) - \sigma(W')$$

But this signature must be zero: the fact that $W \cup W'$ is framed
implies that its Pontrjagin classes are all zero and hence, by the
signature theorem, that its signature is zero.

Thus $\sigma(W)$ is an invariant $\theta$ of $(M, \varphi)$. In this
way we define a homomorphism

$$\theta : \text{kernel}\,(j) \to Z$$

The construction of Section 2 shows that this homomorphism $\theta$ is
nontrivial.*

Finally we claim that the kernel of $\theta$ is zero. That is, if
$(M, \varphi)$ bounds $(W, \psi)$ with $\sigma(W) = 0$, then $M$ is
diffeomorphic to $S^n$, and $\varphi$ corresponds to the standard
framing of $S^n$. The proof, which is fairly formidable, is divided
into two parts. First, using the method of “surgery” [9], we show that
all of the homotopy groups of $W$ can be killed. In other words $W$
can be replaced by a contractible manifold $W'$. Second, a theorem of
Smale [19] asserts that such a manifold $W'$ must be diffeomorphic to
the $(n + 1)$-disk. No further details are given here.

Thus kernel $(j)$ is infinite cyclic. In other words there is an
exact sequence

$$0 \to Z \to \mathrm{Fr}_{4i-1} \to \Pi_{4i-1} \to 0$$

It follows that $\mathrm{Fr}_{4i-1}$ is an infinite abelian group of
rank 1. But $\pi_{4i-1}SO(k) = Z$ is also a group of rank 1. Using the
sequence

$$\pi_{4i-1}SO(k) \to \mathrm{Fr}_{4i-1} \to \Gamma_{4i-1} \to 0$$

we finally see that the group $\Gamma_{4i-1}$ is finite.
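
The rank count behind this conclusion can be spelled out by tensoring
with the rationals. What follows is a sketch under a stated
assumption, not the author's own argument: the name $\alpha$ for the
map $\pi_{4i-1}SO(k) \to \mathrm{Fr}_{4i-1}$ is introduced here, and we
assume that $\alpha$ has infinite image, a point which the passage
uses implicitly but does not prove.

```latex
% Q denotes the rationals; tensoring with Q preserves exactness.
% Since \Pi_{4i-1} is finite, \Pi_{4i-1} \otimes Q = 0, so
\[
  0 \to Z \to \mathrm{Fr}_{4i-1} \to \Pi_{4i-1} \to 0
  \quad\Longrightarrow\quad
  \mathrm{Fr}_{4i-1} \otimes Q \cong Q .
\]
% Apply the same functor to the second sequence, with
% \alpha : \pi_{4i-1}SO(k) = Z \to \mathrm{Fr}_{4i-1} (our notation):
\[
  Q \stackrel{\alpha \otimes 1}{\longrightarrow} Q
    \longrightarrow \Gamma_{4i-1} \otimes Q \longrightarrow 0 .
\]
% ASSUMPTION: \alpha has infinite image, i.e. \alpha \otimes 1 \neq 0.
% A nonzero Q-linear map Q \to Q is onto, hence
\[
  \Gamma_{4i-1} \otimes Q = 0 .
\]
% So \Gamma_{4i-1} is a torsion group.  It is also finitely generated,
% being a quotient of \mathrm{Fr}_{4i-1} (an extension of a finite
% group by Z), and a finitely generated torsion abelian group is finite.
```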

* It is curious that $\theta$ behaves somewhat differently in
dimension 3 than in higher dimensions. Thus the image of $\theta$ is
generated by 8 if $n = 4i - 1 \ge 7$, but is generated by 16 if
$n = 3$.




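Similar constructions work for other values of $n$. Thus

$$0 \to \mathrm{Fr}_n \to \pi_{n+k}S^k \to 0 \qquad \text{for } n \equiv 0 \ (4)$$

$$Z_2 \to \mathrm{Fr}_n \to \pi_{n+k}S^k \to 0 \qquad \text{for } n \equiv 1 \ (4)$$

$$0 \to \mathrm{Fr}_n \to \pi_{n+k}S^k \to Z_2 \qquad \text{for } n \equiv 2 \ (4)$$

$$0 \to Z \to \mathrm{Fr}_n \to \pi_{n+k}S^k \to 0 \qquad \text{for } n \equiv 3 \ (4)$$

For further information on these groups, the reader is referred to [6].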


REFERENCES 

1. R. Bott, “The stable homotopy of the classical groups,” Ann. of Math. 
70 (1959) 313-337. 

2. M. Brown, “The monotone union of open n-cells is an open n-cell,” Proc.
Amer. Math. Soc. 12 (1961) 812-814.

3. J. Cerf, “Topologie de certains espaces de plongements,” Bull. Soc. Math. 
France 89 (1961) 227-380. 

4. F. Hirzebruch, Neue topologische Methoden in der algebraischen Geometrie,
Berlin (1956) Springer.

5. M. Kervaire and J. Milnor, “Bernoulli numbers, homotopy groups and a
theorem of Rohlin,” Proc. Intern. Congr. Math., Edinburgh (1958).

6. -, “Groups of homotopy spheres: I,” Ann. of Math. 77 (1963) 504-537.

7. J. Milnor, “On manifolds homeomorphic to the 7-sphere,” Ann. of Math.
64 (1956) 399-405.

8. -, “Sommes de variétés différentiables et structures différentiables des
sphères,” Bull. Soc. Math. France 87 (1959) 439-444.

9. -, “A procedure for killing the homotopy groups of differentiable
manifolds,” Symposia in Pure Math. A.M.S. vol. III (1961) 39-55.

10. -, “A survey of cobordism theory,” L'Enseignement Math. 8 (1962)
16-23.

11. -, Morse theory, Princeton, N.J. (1963) Princeton University Press.

12. M. Morse, “Relations between the numbers of critical points of a real func¬ 
tion of n independent variables,” Trans. Amer. Math. Soc. 27 (1925) 
345-396. 

13. J. Munkres, Elementary differential topology, Princeton, N.J. (1963)
Princeton University Press.

14. R. Palais, “Extending diffeomorphisms,” Proc. Amer. Math. Soc. 11 (1960)
274-277.

15. L. Pontrjagin, “Smooth manifolds and their applications in homotopy
theory,” Trudy Mat. Inst. im. Steklov 46 (1955), and A.M.S. translations,
Series 2, 11, 1-114.

16. G. Reeb, Sur certaines propriétés topologiques des variétés feuilletées, Paris
(1952) Hermann.

17. R. Rosen, “A weak form of the star conjecture for manifolds,” Abstract
570-28, Notices Amer. Math. Soc. 7 (1960) 380.






18. J.-P. Serre, “Homologie singulière des espaces fibrés. Applications,” Ann.
of Math. 54 (1951) 425-505.

19. S. Smale, “On the structure of manifolds,” Amer. J. Math. 84 (1962)
387-399.

20. J. Stallings, “Polyhedral homotopy spheres,” Bull. Amer. Math. Soc. 66
(1960) 485-488.

21. -, “The piecewise linear structure of Euclidean space,” Proc.
Cambridge Philos. Soc. 58 (1962) 481-488.

22. H. Weyl, “Análisis situs combinatorio,” Revista Mat. Hispano-Americana,
5-6 (1923-1924), 209-218ff.

23. H. Whitney, “Differentiable manifolds,” Ann. of Math. 37 (1936) 645-680.

24. C. Zeeman, “The generalized Poincaré conjecture,” Bull. Amer. Math.
Soc. 67 (1961) 270.


Lectures on
Modern Mathematics

in three volumes

Edited by T. L. Saaty, Arms Control and Disarmament Agency

All those desiring to keep abreast of the major achievements
in the various fields of mathematics will welcome
this work. Not only will it cover virtually every principal
area of modern mathematics but it is written to reach as
wide an audience as possible and to make readily accessible
a maximum amount of material.

Each of the eminent contributors delineates a substantial
research area. He describes it broadly and comprehensively
and contributes to this description his individual evaluation
of the aesthetic and practical aspects of the field, its position
in mathematical development as a whole, and its
future, as it is implied in the conjectural exposition of
its unsolved problems.

The monographs are expanded versions of expository
lectures, conceived and organized by the editor and given
at The George Washington University under the sponsorship
of the University and the Office of Naval Research.

In writing the monographs, the lecturers have successfully
compressed the enormous amount of material at their
command and at the same time have preserved an informal
approach, have refrained from any intricate excursions
that might intimidate the novice, and have given each
monograph the personal flavor of their own involvement
in mathematical research.

JOHN WILEY & SONS

New York · London · Sydney